Chinese Zero to Hero
Settings are saved automatically as soon as you change them.

Settings

神奇的丝瓜

《标准教程 HSK 6》第18课课文

春天,孩子们在楼旁空地上开出一个小小的花园,随即种上了一棵树、几株花和几粒丝瓜种子。土壤不是很肥沃,但有水的滋润,阳光的照耀,没几天,丝瓜就从土里冒了出来,接着我惊讶地发现,它好像每时每刻都在长大。看着丝瓜,我心中难免不解:古人是怎么想的,愣是编出个拔苗助长的故事来?要是我,宁愿用别的比喻。

Translation: The Amazing Loofah (text of Lesson 18, HSK Standard Course 6)

In spring, the children cleared a small garden on the vacant lot beside the building and promptly planted a tree, a few flowers, and a few loofah seeds. The soil was not very fertile, but with water to nourish them and sunshine to warm them, the loofahs broke through the soil within a few days, and I was then amazed to find that they seemed to be growing every moment. Watching the loofah, I could not help wondering: what were the ancients thinking when they made up the story of pulling up seedlings to help them grow? If it were me, I would rather use a different metaphor.

Text Corpus Settings

A text corpus is a large collection of text written in a language, from which we can extract collocations and example sentences. Sketch Engine, our text corpora provider, offers a number of Chinese text corpora to choose from. The example sentences and collocations you see depend on the corpus you choose.

| Corpus | Code | Language | Words | Note | Tags |
| --- | --- | --- | --- | --- | --- |
| Chinese Web 2017 (zhTenTen17) Simplified | zhtenten17_simplified_stf2 | Chinese Simplified | 13,531,331,169 | Chinese web corpus. Simplified script texts. Downloaded by SpiderLing in Aug & Nov 2017. Cleaned, deduplicated, foreign language filtered. Tagged by Stanford Core NLP Tools (pipeline v2). | Featured, Parallel, Web |
| Chinese Web 2011 (zhTenTen11, Stanford tagger) | zhtenten_lenoch | Chinese Simplified | 1,729,867,455 | Chinese (mainland + traditional, mostly mainland) web corpus crawled in 2011. Tagged by Stanford Log-linear Part-Of-Speech Tagger using the Chinese Penn Treebank standard models. Word sketch grammar by Ondrej Svoboda. | Parallel, Web |
| Chinese Simplified Web 2017 sample | zhtenten17_simplified_stf2_term_ref | Chinese Simplified | 250,361,047 | Chinese web corpus. Simplified script texts. Downloaded by SpiderLing in Aug & Nov 2017. Cleaned, deduplicated, foreign language filtered. Tagged by Stanford Core NLP Tools (pipeline v2). Sample for term extraction. | Parallel |
| OPUS2 Chinese Simplified | opus2_zh | Chinese Simplified | 243,427,123 | Chinese Simplified corpus of OPUS2 (open source parallel corpus). Encoded in UTF-8, tagged with Chinese Penn Treebank v2 tagset. OPUS2 collection contains 40 languages. | Parallel |
| Chinese GigaWord 2 Corpus: Mainland, simplified | cgw2_sc | Chinese Simplified | 205,031,379 | Chinese Simplified Gigaword 2 corpus of newswire created in 2005. Encoded in UTF-8. Tagged with Chinese GigaWord tagset. | Parallel |
| Chinese Web (Internet-ZH, NEUCSP tagger) | i_zh | Chinese Simplified | 198,205,344 | Chinese web corpus collected by Serge Sharoff. Encoded in UTF-8, tokenised and part-of-speech tagged using tools from Northeastern University in China. | Parallel, Web |
| Chinese Web 2011 (zhTenTen11, sample 10M) | zhtenten_10M | Chinese Simplified | 9,012,125 | Sample of Chinese (mainland + traditional, mostly mainland) web corpus crawled in 2011. Tagged by Stanford Log-linear Part-Of-Speech Tagger using the Chinese Penn Treebank standard models. Sketch grammar prepared by Simon Smith. | Parallel, Web |

Parallel means that an L1 translation is available.