
Chinese

Search Strategy

Many people find it very hard to search for items in other languages. This is due to a mix of technological and cataloging realities that most people do not realize are at play. 

  1. Cataloging is a process that assigns specific metadata to items like books, images, articles, artwork, etc. so that these items can be found by other people. Cataloging is labor-intensive work that requires highly trained professionals. There is also a lot of subjectivity in cataloging. For example, one cataloger may assign the subject "Military art and science" to The Art of War, while another might classify it as "Leadership". This seemingly minor difference has a major impact on how the information is organized, searched, and found by readers. 
  2. Cataloging items in various languages is very difficult because the cataloger has to balance the needs of the institution and the needs of searchers with the information that is available to them about the item. For example, one cataloger may label the works of Confucius with the creator term "Confucius". However, another cataloger may use "Kǒngzǐ" or even "孔子". Other issues involve translation choices, like the decision to use a direct translation of a title rather than its popular translation.
  3. Catalogers make mistakes, especially with languages that they are not fluent in. Many catalogers need to catalog items in dozens of languages, so it would be impossible to learn all of them. Furthermore, many catalogers have to transliterate information, which is the process of converting text from a non-Latin script into a Latin script. This is necessary for searching for items in many databases and catalogs. 
  4. Databases are built around searching this metadata and are built around a specific language. Searching outside of this language rarely yields good results unless the items have been cataloged in multiple languages. 
  5. Databases often cannot translate searches. If you search for something in English, you will usually get only English results.

 

One of the best ways to search for what you want is to tell the database the multiple ways in which the item information could appear by using the OR Boolean operator. 

The "OR" operator tells the database to look for items that contain any of the terms listed, rather than a single term. For example, searching only "global warming" would bring up only articles that use that exact term. This leaves out results that cover the same concept but use a different term to describe it. By searching "global warming OR climate change", the database will find many more results. 

This is a very powerful tool as it can help tell the database that the items we are looking for may be in different languages with different metadata. For example: 

 Xunzi OR Master Xun OR Xun Kuang OR 荀子

would yield more results from more language sources than searching "Xunzi" alone. 
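
If you keep a list of alternative names, the OR search string can be assembled automatically. The sketch below is only an illustration: the function name build_or_query is made up for this guide, and it assumes the database treats text in double quotes as an exact phrase, which many databases do but not all. Check the help page of the database you are using.

```python
def build_or_query(terms):
    """Join alternative names for the same item into one OR search string."""
    seen = set()
    parts = []
    for term in terms:
        term = term.strip()
        if not term or term in seen:
            continue  # skip blanks and repeated terms
        seen.add(term)
        # Assumption: double quotes mean "search this as a phrase".
        parts.append(f'"{term}"' if " " in term else term)
    return " OR ".join(parts)


query = build_or_query(["Xunzi", "Master Xun", "Xun Kuang", "荀子"])
print(query)  # Xunzi OR "Master Xun" OR "Xun Kuang" OR 荀子
```

Quoting the multi-word names keeps a database from treating "Master" and "Xun" as two separate search words.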

Finding translated terms is not always easy because translation is not an exact science. A great way to find many translations of a concept is to use Wikipedia and Wikidata. Once you have the terms, use the "OR" operator with the translations and your original term to find items. 
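
For readers comfortable with a little scripting, Wikidata labels and aliases can also be collected programmatically. The sketch below makes several assumptions: the helper names are invented for this guide, the entity ID must be looked up by you on Wikidata, and sample_entity is a hand-written stand-in built from the names used in this guide, not real Wikidata output. It mirrors the shape of one entity inside the "entities" part of a response from Wikidata's wbgetentities API action.

```python
from urllib.parse import urlencode


def wikidata_entity_url(entity_id, languages):
    """Build a wbgetentities request URL for one entity's labels and aliases."""
    params = {
        "action": "wbgetentities",
        "ids": entity_id,  # the item's Q-number, found on its Wikidata page
        "props": "labels|aliases",
        "languages": "|".join(languages),
        "format": "json",
    }
    return "https://www.wikidata.org/w/api.php?" + urlencode(params)


def collect_terms(entity, languages):
    """Gather the label and aliases for each language, without repeats."""
    terms = []
    for lang in languages:
        label = entity.get("labels", {}).get(lang)
        if label:
            terms.append(label["value"])
        for alias in entity.get("aliases", {}).get(lang, []):
            terms.append(alias["value"])
    return list(dict.fromkeys(terms))  # drop duplicates, keep order


# Hand-written sample, NOT real API output.
sample_entity = {
    "labels": {
        "en": {"language": "en", "value": "Xunzi"},
        "zh": {"language": "zh", "value": "荀子"},
    },
    "aliases": {
        "en": [
            {"language": "en", "value": "Xun Kuang"},
            {"language": "en", "value": "Master Xun"},
        ]
    },
}

print(" OR ".join(collect_terms(sample_entity, ["en", "zh"])))
```

Labels and aliases on Wikidata are entered by volunteers, so treat the collected terms as suggestions and review them before searching.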

 

Transliteration is the conversion of text from one script to another by swapping letters in predictable ways. Until very recently, it was not possible to perform searches across multiple scripts, and many databases still struggle with this. 

Many records still use transliterated or romanized versions of names and titles. Essentially, romanization matches sounds from the item's language with letters in the Latin alphabet. Romanization tables standardize this process. 
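
Romanized names are often recorded both with and without tone marks (for example "Kǒngzǐ" and "Kongzi"), so it can help to search for both forms. The sketch below is not a transliteration tool and does not replace a romanization table. It only removes accent marks from text that is already romanized, using Python's standard unicodedata module, and the function names are invented for this guide.

```python
import unicodedata


def strip_diacritics(text):
    """Remove accent and tone marks, e.g. 'Kǒngzǐ' becomes 'Kongzi'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def with_plain_variants(terms):
    """Return each term followed by its unaccented form, without repeats."""
    variants = []
    for term in terms:
        for candidate in (term, strip_diacritics(term)):
            if candidate not in variants:
                variants.append(candidate)
    return variants


print(" OR ".join(with_plain_variants(["Kǒngzǐ", "Confucius", "孔子"])))
# Kǒngzǐ OR Kongzi OR Confucius OR 孔子
```

Chinese characters pass through unchanged, so the same list can mix scripts.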

 

 

Every database is a little different, which means that searching them effectively may be different from what you are used to. Fortunately, many databases have pages that walk you through how to use them. 

Databases

Below are links to different primary source collections. Use the "Database" tab to the left to find databases for secondary sources. 

When searching for images, keep in mind that computers do not see images as we do and what you are often searching is not the content of the image itself but the metadata or descriptors that someone else has assigned to it. 

Best Practices:  

1. Play around with search terms when searching for images of specific subjects (e.g., "Cat"), because subjects are highly subjective and the person who added the metadata may have used a different term than you would have. (For example, they may have used "Feline" or "Tuxedo".)

2. Use general place or time period names. For example, searching for art with the terms "China AND Qin" will bring up results whose metadata contains both terms. 

3. Pay attention to copyright and properly cite images. Do not repost images that are copyrighted, and be sure to cite the creator, title, and the institution that provided the image. 

Physical Collection