CONTACT US
School of Information Resources
and Library Science
1515 East First Street
Tucson, AZ 85719
Tel: (520) 621-3565
Fax: (520) 621-3279
sirls@email.arizona.edu
Department Director
Dr. Bryan Heidorn
heidorn@email.arizona.edu
Assistant Professor, Information Technologies
Telephone: 520-621-3565
Degree(s):
Institution: University of Illinois at Urbana-Champaign. Degree: Ph.D. Date: 2005. Dissertation: “Automating Semantic Markup of Semi-Structured Text Via An Induced Knowledge Base: A Case-Study Using Floras”. Advisor: Prof. Linda C. Smith. Field: Library and Information Science.
Institution: University of Illinois at Urbana-Champaign. Degree: M.C.S. Date: 2002. Field: Computer Science.
Institution: Chinese Academy of Sciences. Degree: M.S. Date: 1997. Thesis: “A Study of Citation Behavior of Chinese Scientists”. Advisor: Prof. Yitai Gong. Field: Information Science.
Institution: Tongji Medical University. Degree: B.A. Date: 1994. Thesis: “Why Citation Analysis”. Advisor: Prof. Huiji Qin. Field: Medical Information and Librarianship.
Academic Honours: (not research grants)
"Outstanding Full-Time Faculty" May 2009, School of Information Resources and Library Science, University of Arizona
"Most Progressive Full-Time Faculty" Spring 2009 award by Progressive Librarians Guild SIRLS chapter, University of Arizona.
Elected into full membership of Beta Phi Mu International Library and Information Studies Honor Society. 2007-
The Berner Nash Memorial Award for outstanding doctoral dissertation, May 2005. Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Dissertation title: Automating Semantic Markup of Semi-Structured Text via an Induced Knowledge Base: A Case-Study Using Floras. Completed April 2005.
The Incomplete List of Teachers Ranked as Excellent by Their Students, Fall 2004 for LIS390W1A, University of Illinois at Urbana-Champaign.
Jean Tague-Sutcliffe Award, Association for Library and Information Science Education, 2004
GSLIS Fellowship, UIUC 2002-2003.
Research Interests:
Dr. Cui's research focuses on machine learning applications for semantic annotation of semi-structured information, currently with an emphasis on biodiversity literature. She develops and evaluates machine learning algorithms for converting born-digital and digitized taxonomic descriptions into Semantic Web formats (e.g., XML, RDF). By turning human-readable scientific information into a form that computers can read and process, her work directly affects how scientific information can be retrieved and used in the digital era. She is the principal investigator of a National Science Foundation-funded project (EF-0849982) entitled "Fine-Grained Semantic Markup of Descriptive Data for Knowledge Applications in Biodiversity Domains", funded for a total of $700,452 (2009-2012). The Flora of North America (FNA) Project has also provided multiple grants for her work on improving how the knowledge in FNA can be accessed and used. The methodology developed by Dr. Cui has been adopted by several other research groups in the US and abroad.
Selected Publications:
Duan, Y, Hei, Z, Ju, F., Cui, H. (2012). Study on Semantic Markup of Species Description Text in Chinese Based on Auto-Learned Rules. New Technology of Library and Information Services (Chinese). 2012 (5).
Duan, Y, Hei, Z, Ju, F., Cui, H. (2012). Semantic Annotation of Species Description Text in Chinese Literature by Naive Bayes Classifier. Journal of the China Society for Scientific and Technical Information. 31, (8), 805-812.
Cui, H., Dusenbery, A., Morris, R.A., Macklin, J., & Huang, F (2012). Semantic Annotation, Ontology Building, and Interactive Key Generation from Morphological Descriptions. TDWG Annual Meeting 2012, Beijing, Oct 22-26, 2012.
Huang, F.Q., Macklin, J., Morris, P., Sanyal, P.P., Morris, R.A., Cole, H. & Cui, H. (2012). OTO: Ontology Term Organizer. Annual Conference of American Society for Information Science and Technology. [poster]
Arighi, C.N., Carterette, B., Cohen K.B. et al. (In Press). An Overview of the BioCreative 2012 Workshop Track III: Interactive Text Mining Task. Database.
Walls, R., Cui, H., Macklin J., Mungall, C., Cooper, L., Stevenson, D.W., & Jaiswal, P.(2012) Mapping of glossary terms from the Flora of North America to the Plant Ontology enhances both resources. Proceedings of the 3rd International Conference on Biomedical Ontology, June 2012 Austria.
Cui, H., Balhoff J., Dahdul W., Lapp H., Mabee P., Vision T., & Chang, Z. (2012). PCS for Phylogenetic Systematic Literature Curation. Proceedings of the BioCreative workshop 2012 (pp.137-144). [full paper]
Chang Z., Balhoff J., Dahdul W., Lapp H., Mabee P., Vision T., & Cui, H. (2012). Workflow of CharaParser and Phenex: Turning character descriptions to EQ statements. BioCreative workshop 2012, Washington, D.C. [poster and software demo]
Thessen, A., Cui, H., & Mozzherin, D. (2012). Applications of Natural Language Processing in Biodiversity Science. Advances in Bioinformatics. Volume 2012. doi:10.1155/2012/391574. http://www.hindawi.com/journals/abi/2012/391574/
Janning, A. & Cui, H. (2012). Evaluating the botanical coverage of PATO using an unsupervised learning algorithm. iConference 2012, Feb 7-10, 2012, Toronto, Canada.
Cui, H. (2012). CharaParser for fine-grained semantic annotation of organism morphological descriptions. Journal of American Society of Information Science and Technology. 63(4) DOI: 10.1002/asi.22618 http://onlinelibrary.wiley.com/doi/10.1002/asi.22618/pdf
Cui, H., Singaram, S., & Janning, A. (2011). Combine Unsupervised Learning and Heuristic Rules to Annotate Morphological Characters. 2011 Annual Meeting of American Society of Information Science and Technology, New Orleans, Oct 9-12, 2011.
Macklin, J., Cui, H., Morris, R., Morris, P. (2011). Floras in the 21st Century: The Flora of North America. In Creating Next Generation Floras Symposium, International Botanical Congress, Australia, July 22-30 2011 [abstract+oral presentation]
Cui, H. (2011). Fine-Grained Semantic Markup of Descriptive Data. Panelist in Informatics Tools for the Semantic Enhancement of Taxonomic Literature Symposium, XVIII International Botanical Congress, 2011. Melbourne, Australia, July 23-30. [Abstract and oral presentation].
Cui, H., Duan, Y. & Li, F. (2011). Machine learning based semantic markup of biodiversity literature in English. Document, Information, & Knowledge, 2, [In Chinese. The paper is a review of semantic markup research conducted by Cui and collaborators. The review was written by Duan and his student Li].
Cui, H., Jiang, Y, & Sanyal P.P. (2010). From Text to RDF Triple Store: An Application for Biodiversity Literature [Demo]. Proceedings of the 73rd ASIS&T Annual Meeting v. 47. Oct 22-27, 2010. Pittsburgh, PA. http://www.asis.org/asist2010/proceedings/proceedings/ASIST_AM10/openpage.html
Cui, H. (2010). Unsupervised Extraction of Text Segments from Heterogeneous Document Collections [Poster]. Proceedings of the 73rd ASIS&T Annual Meeting v. 47. Oct 22-27, 2010. Pittsburgh, PA. http://www.asis.org/asist2010/proceedings/proceedings/ASIST_AM10/openpage.html
Cui, H., Sanyal P.P., & Yu C. (2010). Tools for Semantic Annotation of Taxonomic Descriptions. In R. Setchi et al. (Eds.): Proceedings of 14th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems., Part IV, Lecture Notes in Artificial Intelligence 6279, pp. 506--516. Springer Heidelberg. Sept 8-10, 2010, Cardiff, Wales, UK.
Cui, H. (2010). Linking Corpus Characteristics to Performance of Semantic Annotation Systems for Biosystematic Descriptions. Proceedings of the 2nd International Conference On Bioinformatics and Biomedical Technology. Chengdu, China. April 16-18, 2010.
Cui, H. (2010). Competency Evaluation of Plant Character Ontologies Against Domain Literature. Journal of American Society of Information Science and Technology. 61(6):1144-1165. http://www3.interscience.wiley.com/cgi-bin/fulltext/123319711/PDFSTART
Cui, H., Boufford, D., & Selden, P. (2010). Semantic Annotation of Biosystematics Literature without Training Examples. Journal of American Society of Information Science and Technology. 61 (3): 522-542. http://onlinelibrary.wiley.com/doi/10.1002/asi.21246/full
Cui, H. (2010). Semantic Annotation of Morphological Descriptions: An Overall Strategy. BMC Bioinformatics. 11:278. DOI:10.1186/1471-2105-11-278. http://www.biomedcentral.com/1471-2105/11/278
Cui, H. et al. (2009). "Fine-Grained Semantic Annotation of Descriptive Data for Knowledge Application in Biodiversity". TDWG 09. France.
Cui, H., Yu C., & McKline, J. (2009). Application of Semantic Annotation for Quality Insurance in Biosystematics Publishing. Proceedings of the Annual Meeting of American Society of Information Science and Technology 2009.
Cui, H. (2008). Converting Taxonomic Descriptions to New Digital Formats. Biodiversity Informatics. 2008. 20-40.
Cui, H. (2008). Approaches to Semantic Mark up for Natural Heritage Literature. Proceedings of the iConference 2008. http://www.ischools.org/oc/conference08/pc/PA5-2_iconf08.doc
Cui, H. (2008). Unsupervised Learning for Semantic Markup of Biodiversity Literature. Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries, (pp. 25-28).
Cui, H. (2008). An Application for Semantic Markup of Biodiversity Documents (System Demonstration). Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries, (pp.421)
Cui, H., Sai, D., & Tang, X. (2007). Chapter 3: Information Representation. In Heting Chu & Yin Zhang. (Eds.), Research Fronts in Library and Information Science in the West. Beijing: Renmin University Press. (Series on Research Fronts in the Humanities and Social Sciences in the West).
Cui, H. & Heidorn, P.B. (2007). The reusability of induced knowledge for the automatic semantic markup of taxonomic descriptions. Journal of the American Society for Information Science and Technology. 58(1), 133-149. http://www3.interscience.wiley.com/cgi-bin/fulltext/113466052/PDFSTART
Cui, H. & Nickerson, G. (2007). Use Server2Go to Teach IT Courses for LIS Students. Journal of Association for Library and Information Science Education. 48 (4). 261-271.
McCourt, R.M., Cui, H., Guiry, M. & Feist, M. (2006) Using Machine Learning Environments to Extract Taxonomic Information from Text: An example from print and online texts on algae (Poster). Phycological Society of America Annual Meeting 2006. July 7-12, Juneau, Alaska, USA.
Hirokawa, S. & Cui, H. (2006) Automatic Generation of Hierarchy for Plant Identification Terminology (System Demo). The 2nd International Digital Curation Conference. Nov. 21-22, Glasgow, UK.
Cui, H. (2006). Automatic markup of morphological descriptions. Workshop on Refactoring Natural History Literature: Use and Reuse of Natural History Collections. April 17-18, 2006. GSLIS/University of Illinois at Urbana Champaign.
Cui, H., McCourt, R. M., & Feist, M. (2006). Unsupervised Structure Discovery for Biodiversity Information. (System Demonstration) Proceedings of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries. June 11-15, Chapel Hill, NC, USA. 384.
Cui, H., McCourt, R. M., & Feist, M. (2006). Automated Concept Discovery in Corpora of Morphological Descriptions. Proceedings of the Annual Meeting of American Society for Information Science and Technology (CD-ROM). November 3-8, Austin, Texas, USA.
Cui, H. (2005). MARTT: Automatic markup of taxonomic descriptions with XML. CAIS 2005. Jun 2-4, 2005. London, Ontario.
Cui, H. (2005). MARTT: Using knowledge based approach to automatically mark up plant taxonomic descriptions with XML. Proceedings of the Annual Meeting of American Society for Information Science and Technology. Oct 28-Nov 2, 2005. Charlotte, North Carolina, USA.
Cui, H. (2005). A machine learning environment for automatic markup of taxonomic descriptions with XML. Taxonomic Database Working Group 2005 Annual Meeting (ISBN:3-921800579). 16
Cui, H. (2004). Knowledge-based semantic markup of plant descriptions. DOCSIG Poster, ALISE 2004.
Cui, H., Heidorn, P.B., & Zhang, H. (2002). An approach to automatic classification for information retrieval. Proceedings of the Joint Conference of Digital Libraries 2002, 96-97.
Heidorn, P.B., Cui, H., Yu, B. Wu, J., & Zhang, H. (2002). Taxonomic description creation, search and display in XML. Abstract. Botany 2002.
Cui, H. (2002). Automatic/semi-automatic parse of Flora of North America records into XML format using machine-learning techniques. DOCSIG Poster, ALISE 2002.
Heidorn, P.B. & Cui, H. (2000). The interaction of result set display dimensionality and cognitive factors in information retrieval systems. Proceedings of the Annual Meeting of the American Society for Information Science, ASIS 2000, 258-270.
Cui, H. (1998). A clustering analysis on citation motivations of Chinese scientists. Journal of Information Science (Chinese), 17(2). 68-70.
Cui, H. (1998). Analyzing self-citations in Chinese scientists. Information: Theory and Application (Chinese), 21(3). 153-154, 176.
Software/Database Created
CharaParser, the unsupervised semantic parser, has extracted terms from textual morphological descriptions. These terms are submitted to ontologies: https://sites.google.com/site/biosemanticsproject/ontology-discussions/candidate-ontology-terms
Unsupervised Semantic Markup System for Organisms Descriptions. (Video Demo: part 1 (5min): http://screencast.com/t/OTdlZDNm and part 2 (5min): http://screencast.com/t/ODVmZmZjM)
Supervised Semantic Markup System for Organisms Descriptions (MARTT). Demos and downloadable application: http://sites.google.com/site/biosemanticsproject/project-progress-wiki
GreenStone Digital Library Collections for Flora of China, Flora of North America, Flora of North Central Texas. (URL: http://research.sbs.arizona.edu/gs/cgi-bin/library)
MARTT: Semantic Markup System for Taxonomic Treatments. (Demo URL: http://research.sbs.arizona.edu/~hongc/ResearchDemo/MARTTDemo1.html)
Collection of instructional movies made for IRLS 515: (URL: http://research.sbs.arizona.edu/~hongc/515movies/)
Collection of instructional movies made for IRLS 630: (URL: http://research.sbs.arizona.edu/~hongc/630/)
Invited Presentations
Cui, H. (2013) CharaParser and associated software. pro-iBiosphere workshop. Leiden, the Netherlands. Feb 12-14, 2013.
Cui, H. (2013) Parsing morphological descriptions to support semantic-based access. pro-iBiosphere workshop. Leiden, the Netherlands. Feb 12-14, 2013.
Cui, H. (2013) Markup tools: CharaParser. pro-iBiosphere workshop. Leiden, the Netherlands. Feb 12-14, 2013.
Cui, H. (2011) "Fine-Grained Semantic Annotation of Taxonomic Descriptions: Progress and Challenges" at the Florida State University
Cui, H. (2011) "Text Mining in Taxonomic Literature". Presented to students of Natural History Museum, Bogota, Colombia. Aug 10, 2011.
Cui, H. (2011) "Fine-Grained Semantic Annotation of Taxonomic Descriptions: Progress and Challenges" at the Marine Biology Laboratory, Woods Hole, Boston.
Cui, H. (2011). Fine-Grained Semantic Markup of Descriptive Data. XVIII International Botanical Congress, Melbourne, Australia, July 2011
Invited to present for the IRLS504 Summer 2010 classes, Jun 4, and July 31, 2010.
Invited to present at School of Management Information Systems, East China Normal University, Shanghai, China. May 12, 2010.
Invited to present at Department of Medical Information Management, Tongji Medical College, Huazhong Science and Technology University, Wuhan, China. April 20, 2010.
Invited to present for the IRLS504 Spring 2010 class, Jan 5, 2010.
Invited to present for CSC296H/496H Research Topics in Computer Science (Honors Students Seminar), Nov 4, 2009.
Invited to present for the IRLS504 Summer 09 class, Jun 28, 2009.
Invited to present at LING 696G Computational Linguistics Seminar (Instructor: Sandiway Fong), Sept 24, 2008.
Invited to present at the Cognitive Science Colloquium, University of Arizona, Sept 12, 2008
Invited to present at the Cognitive Science Graduate Seminar, University of Arizona, Sept 11, 2008.
Invited to present the unsupervised semantic markup system for biodiversity literature at the Paleontology Institute, University of Kansas, May 11-14, 2008
Invited to present the MARTT semantic markup system for biodiversity literature at the Harvard University Herbaria. Apr 21-24, 2008
Invited to present for the doctoral seminar LIS590IRR: Information Retrieval and Natural Language Processing of Prof. Heidorn at the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Nov 29, 2006
Invited to present at a meeting of Joint Natural Language Processing Study Group of the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign and the University of North Carolina. Oct 19, 2005.
Courses Taught:
Dr. Cui teaches in the area of information organization and knowledge representation. She regularly teaches the Organization of Information and Controlled Vocabularies courses. Her courses have received high scores from students, and she won three awards from SIRLS students for excellence in teaching and mentoring in 2009 and 2010.
At U of A: Information Organization: IRLS 515: Organization of Information, IRLS630: Controlled Vocabularies, IRLS 588: XML and Semantic Web Standards.
At other institutions: Web Design, Relational Database and Management, Managing Internet Information Services, Information Retrieval.
Projects:
Co-PI: "Collaborative Research: ABI DEVELOPMENT: Using innovative NLP algorithms as a process to populate a life-wide, knowledge-base that will make small science biodiversity data ready for data-intensive research" Submitted to NSF ABI, 2011, Declined
PI: "BCSP:Collaborative Research: ABI Development: Exploring Taxon Concepts (ETC.) through analyzing fine-grained semantic markup of descriptive literature” Submitted to NSF ABI, 2011. Funded. UA Budget: $1M.
PI, “Collaborative Research: Next Generation Phenomics for the Tree of Life,” NSF, $335,000. Awarded (DEB-1208567). 5/2012-4/2015.
Subcontract PI: subcontract to "Collaborative research: ABI Development: Ontology-enabled reasoning across phenotypes from evolution and model organisms" (NSF DBI:1062542), July 2011-June, 2013. $50,000
PI, "Parsing FNA Volumes and Enhancing FNA Character Search Tool," the Flora of North America (FNA) Project, $ 9818.01, Awarded. 5/2012-8/2012.
PI: "Enabling Search for Characters for the Flora of North America II", funded by the FNA Project, May 2010-Dec 2010. $10,000.
PI: "Fine-Grained Semantic Markup of Descriptive Data for Knowledge Applications in Biodiversity Domains", funded by the Advances in Biology Informatics & Emerging Frontiers programs, National Science Foundation (NSF), United States. Aug 2009-July 2012. $700,000.
PI: "Enabling Search for Characters for the Flora of North America", funded by the FNA Project, Dec 2009-May 2010. $10,000.
PI: "Automated Semantic Markup of Flora of North America for Enhanced Access", funded by the FNA Project. May 2008-Sept 2008, $7,000
PI: " The Value of Automated Semantic Annotation for Biodiversity Informatics", funded by the Natural Sciences and Engineering Research Council (NSERC) Canada, Discovery Fund
Jan 2005-2011 UWO Start-up Fund
Jun 2005-2007 FIMS, UWO Internal Research Fund: The Structuredness of Text C
Advising
One Ph.D student of University of Arizona, 2010-
The Progressive Librarians Guild SIRLS chapter.
One Ph.D student (active) of University of Western Ontario on Evidence-based Quality Evaluation of Health Care Websites using computational linguistics and machine learning methods. 2007-
Around 80 Master's students of SIRLS, University of Arizona. 2007-2009
A Master's student of SIRLS on her internship at Tucson Girls Scout Library on setting up an Integrated Library System for the library. 2008-2009
Professional Services (not up to date)
Reviewed an "Individual Discovery" proposal for Natural Sciences and Engineering Research Council Canada, Dec 2010
Science Fair Judge for Wilson K-8, March 2010.
Reviewed a paper for Biodiversity Informatics, May 2010.
Organized a panel presentation on biodiversity informatics at Biological Sciences East for the U of A community in April 2010. The panel members came from 4 US and international research institutes.
Officer of Arizona Chapter of American Society of Information Science and Technology. Revitalizing the Chapter in 2010.
Reviewed a book manuscript on taxonomies for Neal-Schuman Publishers, Feb 2010.
Review panel for an NSF bioinformatics program, Jan 2010.
Reviewed 4 full papers and 3 posters for iConference 2010.
Member of IT Committee of Flora of North America Project, 2009-
Reviewer for Neal-Schuman Publishers, reviewed a book proposal on controlled vocabularies, in 2009.
Member, Conference Planning Committee, iConference 2009. Involved in creating the Call for Participation and organizing the Junior Faculty Mentoring session.
Reviewer for Natural Sciences and Engineering Research Council Canada, Journal of ASIST, Biodiversity Informatics, and iConference 2008.
Award winning ASIST SIGIII officer: Award winning SIGIII Website master 2005-2007
Areas of Study:
Library and Information Science




