Monday, September 19, 2016

Web site to collect dominant opinions of a specific language group on a specific incident

Suppose you want to know how most Germans, Russians, Japanese, Spanish speakers, or English speakers respond to a specific major world incident, such as Russian hackers' break-in to the WADA (World Anti-Doping Agency) database. You can guess that many Americans condemn the hacking while Russians cry hypocrisy. To confirm that, the traditional method is to conduct a poll or survey. But that is time-consuming, somewhat intrusive, and not very efficient. Alternatively, you can go to major Web sites such as cnn.com or abc.com to check (mostly) American readers' comments below this particular news story, do the same on major Russian news media Web sites, and then on the Web sites mostly visited by Germans, Japanese, and so on for the opinions of other nationals. If the news story on those Web sites has gathered enough readers' comments and the comments can be liked by other readers, you can simply read the top comment or comments to learn the popular opinion of that group. No method is perfect. But this approach is fast and easy. A single person can do this job in about ten minutes, given a good reading ability in the different languages.

As stated, this process is manual and probably tedious. Fortunately, it is not technically difficult to automate. Someone can build a single Web site where the reader can search for news reports of the same incident on Web sites in different languages. Then the reader can check the most-liked comments posted by people speaking those languages. This Web site should automatically translate the entered keywords, by way of Google Translate or any other online translation engine, and submit them to representative Web sites such as cnn.com, spiegel.de, elmundo.es, etc. In addition, the Web site should also gather such information from Facebook, where major news media frequently provide news feeds and readers post comments that by default are already sorted by number of likes.
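
To make the idea a little more concrete, here is a rough Python sketch of the core loop: translate the keywords for each language, query one representative site per language, and keep the most-liked comments. Everything in it is an assumption for illustration. The `Comment` class, the `SITES` table, and the `translate` and `fetch_comments` callables are hypothetical placeholders, not real APIs; a real site would have to plug in an actual translation service and a per-site way of collecting comments.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Comment:
    """A reader comment and its like count (hypothetical structure)."""
    text: str
    likes: int


# Assumed mapping from language code to one representative news site.
SITES: Dict[str, str] = {
    "en": "cnn.com",
    "de": "spiegel.de",
    "es": "elmundo.es",
}


def top_comments(comments: Iterable[Comment], k: int = 3) -> List[Comment]:
    """Return the k most-liked comments, most liked first."""
    return sorted(comments, key=lambda c: c.likes, reverse=True)[:k]


def dominant_opinions(
    keywords: str,
    translate: Callable[[str, str], str],
    fetch_comments: Callable[[str, str], Iterable[Comment]],
    k: int = 3,
) -> Dict[str, List[Comment]]:
    """For each language, translate the keywords, query that language's
    site, and keep the k most-liked comments.

    translate(keywords, lang) and fetch_comments(site, query) are
    supplied by the caller; they stand in for a translation engine
    and a site-specific comment collector.
    """
    result: Dict[str, List[Comment]] = {}
    for lang, site in SITES.items():
        query = translate(keywords, lang)
        result[lang] = top_comments(fetch_comments(site, query), k)
    return result


if __name__ == "__main__":
    # Stand-in functions with made-up data, only to show the call shape.
    def fake_translate(keywords: str, lang: str) -> str:
        return f"[{lang}] {keywords}"

    def fake_fetch(site: str, query: str) -> List[Comment]:
        return [
            Comment(f"first comment on {site}", 12),
            Comment(f"second comment on {site}", 340),
            Comment(f"third comment on {site}", 57),
        ]

    opinions = dominant_opinions("WADA hack", fake_translate, fake_fetch, k=2)
    for lang, comments in opinions.items():
        print(lang, [(c.likes, c.text) for c in comments])
```

Passing the translator and the comment collector in as functions keeps the ranking logic independent of any particular translation engine or news site, which is probably where most of the real work would be.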

There are shortcomings in this opinion-gathering method and in the Web site that automates it. Although sampling bias is not unique to any specific polling method, it may be particularly evident in this passive sampling. But more importantly, while machine translation as of 2017 can do a good job with well-written articles, it struggles with casual writing that contains spelling or grammatical errors, which a human can easily tolerate. Readers' comments on social networks have so many such errors that a human translator may have to step in to decipher what the passages exactly mean. If necessary, a group of volunteers from different countries may have to work on such a Web site. Lastly, the people speaking a specific language are not necessarily of a specific nationality. But that's a minor point.
