PyCantonese comes with one built-in corpus, the Hong Kong Cantonese Corpus (HKCanCor). For corpora other than HKCanCor, PyCantonese provides the function read_chat() to read in Cantonese data in the CHAT format. Someone more skilled than I am could try searching for 裏 in other corpora with this Python approach and see what turns up.
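A minimal sketch of that kind of search, assuming PyCantonese is installed and that counting the words containing the character is what is wanted. The helper names `load_words` and `words_containing` are made up for illustration; only `pycantonese.hkcancor()`, `pycantonese.read_chat()` and the corpus method `words()` come from the library.

```python
from collections import Counter


def load_words(chat_path=None):
    """Return the word tokens of a corpus as a list of strings.

    With no path, the built-in HKCanCor is used; otherwise the CHAT
    data at chat_path is read. Hypothetical helper, not part of PyCantonese.
    """
    import pycantonese  # third-party; imported here so the rest runs without it

    if chat_path is None:
        corpus = pycantonese.hkcancor()
    else:
        corpus = pycantonese.read_chat(chat_path)
    return corpus.words()


def words_containing(words, character):
    """Count each word in `words` that contains `character`."""
    return Counter(word for word in words if character in word)


# Toy word list standing in for corpus output, so this runs without any data files.
sample_words = ["裏面", "呢度", "裏面", "屋裏"]
sample_hits = words_containing(sample_words, "裏")
```

With a real corpus, `words_containing(load_words(), "裏")` would give the same kind of tally for HKCanCor, and passing a path to `load_words` would do it for another CHAT-format corpus.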
KIII-TV: Corpus Christi police arrest former school volunteer on child pornography charge
KRIS-TV: Corpus Christi police arrest man suspected of targeting women in multiple southside apartment break-ins
Corpus Christi police arrested a 25-year-old man suspected of targeting single women in a series of southside apartment break-ins.
Corpus Christi Caller-Times on MSN: Arrest made in Mimosa Drive murder, Corpus Christi police say
The Corpus Christi Police Department announced on April 20 that an arrest has been made in relation to an April 18 murder on the Westside.
Corpus Christi Caller-Times on MSN: Corpus Christi police arrest man in April 15 killing of teen on Weber
The Corpus Christi Police Department shared an arrest update for the April 15 fatal shooting near Weber Road and Tripoli Drive on April 22.
A 25-year-old man was arrested early Sunday morning after police observed him attempting to break into an apartment, according to the Corpus Christi Police Department.
I would read in the BCC corpus frequency list as a dictionary. Then, having concatenated all the news/magazine articles as plain text, I would build a dictionary of all the words in those articles up to 8 characters long, counting their number of occurrences with the help of the BCC frequency list (which tells us which combinations ...
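A rough sketch of that plan in plain Python. Since the description is cut off, two things here are assumptions: that the frequency list is a text file with one "word count" pair per line, and that its role is to decide which character combinations count as words. The function names are invented for illustration.

```python
from collections import Counter


def load_frequency_list(lines):
    """Parse 'word count' lines into a dict of word -> count.

    The whitespace-separated layout is an assumption about the file format;
    lines that do not fit it are skipped.
    """
    freq = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2 and parts[1].isdecimal():
            freq[parts[0]] = int(parts[1])
    return freq


def count_known_words(text, known_words, max_len=8):
    """Count every substring of 1..max_len characters that is a known word.

    Overlapping matches are all counted, so this is a tally of candidate
    words, not a segmentation of the text.
    """
    counts = Counter()
    for start in range(len(text)):
        for length in range(1, max_len + 1):
            candidate = text[start:start + length]
            if len(candidate) < length:
                break  # ran past the end of the text
            if candidate in known_words:
                counts[candidate] += 1
    return counts


# Toy data standing in for the frequency list and the concatenated articles.
toy_freq = load_frequency_list(["的 100", "我们 50", "bad line"])
toy_counts = count_known_words("我们的我们", toy_freq)
```

For a real run, the lines would come from the frequency list file (for example `open(path, encoding="utf-8")`) and `text` would be the concatenated article text. Each position tries at most 8 substrings, so the scan is linear in the length of the text.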