Sunday, May 21, 2017

Shopper's guide at store entrance

Unless you go to a store where salespeople wait to assist you, a shopper's guide, either a paper map or a computer, can help guide you to the exact aisle and shelf. This is particularly useful in a big store whose products are inherently complicated to choose among, such as Home Depot, where there is always a shortage of salespeople and you often hear "Customer assistance is needed at ..." on the speakers. A simple computer that only tells you the aisle and shelf for a type of product is helpful, but can be further improved. A sophisticated software program should be able to take voice in addition to keyboard input and answer as many questions as it can, not just "Where do I find air conditioner filters?", but also "Do you carry Pfister faucet side sprayer model 951‑026U?", or even "Does SKU LK1CS fit American Standard model ...?" No software is a replacement for a human. But such programs, which software programmers can develop with relative ease, can ease the shortage of in-store employees and, in limited cases, help customers with better accuracy and higher efficiency than humans.

Sunday, October 2, 2016

Optimal age for profession

James Gosling, the creator of the Java programming language, wrote on Facebook that "In the time between Sun and LRI I’d get lines like 'We normally don’t hire people your age, but in your case we might make an exception'". Even a world-class computer programmer is not immune from implicit age discrimination during a job interview. (Or should we call it natural selection in a modern society?) For a long time, I've been thinking about the optimal ages for various professions. They are not necessarily the same as the actual mean ages of the workers in their respective occupations, even though for lack of data, the latter can be substituted as an approximation. For example, the 2015 Labor Department statistics show that "computer and mathematical occupations" have a mean age of 40.8, and "architecture and engineering occupations" 43.6. But as we all know, IT professionals are much less sought after past age 40, or even earlier, while most engineers continue to enjoy their seniority well into their 50s, simply because engineering technology is not evolving as fast as computer technology and accumulated personal experience matters.

"Optimal" is in the eyes of the beholder. As a result, no objective measurement can be constructed. Instead, a survey among a large number of employers is needed. As of today, I know of no such data available. But I can imagine that athletes take the youngest extreme, doctors probably take the other, and IT professionals are not too far away from athletes. Each category can be further divided. Shooting athletes don't have to be as young as runners. Database administration is a job much sought after by ageing programmers, for job security as well as better pay. In China, traditional Chinese doctors (TCM practitioners) will undoubtedly sit at the oldest extreme of this age-popularity axis, surpassing non-TCM doctors (called xīyī in Chinese). In all countries, old glass blowers are dearly loved grandpas, who would kindly reject the job offer for a happy retirement life, at least until 3-D printing catches up with human glass blowing.

Software or Web site for ancient character recognition

As of today, I know of no software that can recognize ancient characters, such as ancient Chinese characters in seal script or oracle bone script, ancient Egyptian hieroglyphs, Maya script, etc. Most of these characters or scripts are not encoded in Unicode (Egyptian hieroglyphs are a notable exception), and so mostly remain as images. A human has to read and interpret them. This takes years of practice and is error-prone. Since OCR (optical character recognition) and even facial recognition have become commonplace, ancient character recognition can probably be made available without too much modification of existing technology. Once that's done, a Web site can be set up for the convenience of scholars and hobbyists alike.

Associated with this technology, the software that can recognize ancient characters can be extended to generate a "Levenshtein distance"-like metric or index. There is at least one use case for this metric. Scholars studying ancient Chinese characters know that before Qin Shi Huang (literally, first emperor of Qin) united China, the writing styles of the same character differed from state to state. Researchers judge the similarity of the styles and group the states accordingly. But this human judgement is inevitably arbitrary and varies from person to person. Software-assisted similarity judgement will be a great step toward standardization. Levenshtein distance works on words composed of letters. But the concept can be extended to glyphs if a computer scientist can cooperate in this research.
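
To make the idea concrete, here is a minimal sketch of the standard Levenshtein distance in Python, written so that it accepts any sequence and not only a string of letters. Describing a glyph as a list of stroke names is purely my assumption for illustration; the stroke labels and the two example variants are hypothetical, and real research would need a carefully designed glyph encoding.

```python
from typing import Hashable, Sequence


def levenshtein(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn sequence a into sequence b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, item_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty sequence takes i deletions
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # delete item_a
                current[j - 1] + 1,      # insert item_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


# The classic example on words composed of letters.
print(levenshtein("kitten", "sitting"))  # 3

# Hypothetical stroke sequences for one character as written in two states.
variant_one = ["horizontal", "vertical", "hook", "dot"]
variant_two = ["horizontal", "curve", "hook"]
print(levenshtein(variant_one, variant_two))  # 2
```

A smaller number means the two writing styles are closer, which gives the grouping of states a repeatable basis, at least to the extent that the stroke encoding itself is agreed upon.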

Monday, September 19, 2016

Web site to collect dominant opinions of a specific language group on a specific incident

Suppose you want to know how most Germans, Russians, Japanese, Spanish-speaking people, or English-speaking people respond to a specific major world incident, such as Russian hackers' break-in of the WADA (World Anti-Doping Agency) database. You can guess that many Americans condemn the hacking while Russians cry hypocrisy. To confirm that, the traditional method is to conduct a poll or survey. But that is time-consuming, somewhat intrusive, and not very efficient. Alternatively, you can go to major Web sites such as cnn.com, abc.com to check (mostly) American readers' comments below this particular news story, and do the same on major Russian news media Web sites, and then on the Web sites mostly visited by Germans, Japanese, for opinions of other nationals. If the news on those Web sites has gathered enough readers' comments and the comments can be liked by other readers, you can simply read the top comment or comments to know the popular opinion of the specific nationals. No method is perfect. But this approach is fast and easy. One single person can do this job in about ten minutes, if he has a good reading ability in different languages.

As stated, this process is manual and probably tedious. Fortunately, it is not technically difficult to automate. Someone can build one single Web site where the reader can search for the news reports of the same incident on Web sites of different languages. Then the reader can check the most liked comments posted by people speaking those languages. This Web site should automatically translate, by way of Google Translate or any online translation engine, the entered keywords and submit them to representative Web sites, such as cnn.com, spiegel.de, elmundo.es, etc. In addition, the Web site should also gather such information from Facebook, where major news media frequently provide news feeds and readers post comments that by default are already sorted by number of likes.
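
The ranking step of such a Web site could look roughly like the Python sketch below. It assumes the comments have already been collected into simple records; the `Comment` class, the `top_comments` function, and the sample rows are all hypothetical placeholders of mine, and fetching and translation are deliberately left out because they depend on each site.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    site: str      # e.g. "cnn.com"
    language: str  # language of the site's readership
    text: str      # comment body (placeholder text here)
    likes: int     # number of likes the comment received


def top_comments(comments, per_language=1):
    """Group comments by language and keep the most liked ones in each group."""
    by_language = {}
    for comment in comments:
        by_language.setdefault(comment.language, []).append(comment)
    return {
        language: sorted(group, key=lambda c: c.likes, reverse=True)[:per_language]
        for language, group in by_language.items()
    }


# Made-up sample rows, only to show the shape of the input.
sample = [
    Comment("cnn.com", "English", "placeholder comment A", 120),
    Comment("cnn.com", "English", "placeholder comment B", 45),
    Comment("spiegel.de", "German", "placeholder comment C", 80),
    Comment("elmundo.es", "Spanish", "placeholder comment D", 10),
    Comment("elmundo.es", "Spanish", "placeholder comment E", 30),
]

for language, best in top_comments(sample).items():
    print(language, [(c.site, c.likes) for c in best])
```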

There are shortcomings in this opinion gathering method and the automated Web site. Although sampling bias is not unique to any specific polling method, it may be particularly evident in this passive sampling. But more importantly, while machine translation as of 2017 can do a good job with well written articles, it struggles with casual writing with spelling or grammatical errors, which a human can easily tolerate. Readers' comments on social networks have so many such errors that a human translator may have to step in to decipher what the passages exactly mean. If necessary, a group of volunteers from different countries may have to work on such a Web site. Lastly, the people speaking a specific language are not necessarily of a specific nationality. But that's a minor point.

Remotely controlled gun to protect premises

Among many ways to prevent burglary, one has not been implemented: a remotely controlled gun. The remote control is paired with a surveillance camera, and the gun trigger is pulled through the necessary solenoid electromechanical mechanism. Although no company makes this product at this time, all the technology involved in this combination of surveillance with a remotely controlled gun is available.

Soccer goalkeeper should move randomly

A soccer goalkeeper or goalie should move left and right in an unpredictable way during a penalty kick, instead of staying in a fixed position as generally practiced. The random movement of the goalkeeper not only interferes with the focus of the penalty shooter, but more importantly, disrupts any plan of his to shoot at a predetermined position of the goal. As for the goalkeeper, most of the time his catch of the ball is by accident anyway. So there's nothing for him to lose.

Wednesday, July 29, 2015

National Days Concentrate in Summer and Early Fall

This is not about a new idea in the sense of a potential invention or patent, but about a new and interesting finding.

There's an uneven seasonal distribution of the national days of the countries in the world. More national days fall in summer and early fall. Using data from Wikipedia, I summarize the months in which the countries observe their national days as follows (excluding special ones such as Israel, whose national day can vary between April 15 and May 15). The first column is the month, and the second column is the count of countries whose national days fall in that month:

April 7
January 8
March 8
June 11
February 13
December 13
May 14
November 16
October 17
September 25
July 25
August 26

The Wikipedia data is not complete; it doesn't even have the day for Australia, among others. Using a more complete data set, from OfficeHolidays.com, I get a slightly different result:

Jan 5
Apr 12
Mar 13
Feb 15
May 16
Dec 16
Jun 18
Oct 19
Nov 20
Aug 24
Sep 26
Jul 33

Fig.1 National days distribution on months of year

Let's assume that the national day is when the country initially announced independence. As you can see, July, August and September are the most common months for announcing independence, together accounting for 83 of the 217 countries in the second data set. With more than 200 data points, this cannot be completely accidental. There should be a good reason for it.
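
As a rough check on the "cannot be completely accidental" claim, the OfficeHolidays counts above can be compared with an even spread over the twelve months using a chi-square statistic. This is only a sketch of mine, not part of the original analysis, and it assumes that every month is equally likely, ignoring the small differences in month length.

```python
# Monthly counts copied from the OfficeHolidays table above.
counts = {
    "Jan": 5, "Feb": 15, "Mar": 13, "Apr": 12, "May": 16, "Jun": 18,
    "Jul": 33, "Aug": 24, "Sep": 26, "Oct": 19, "Nov": 20, "Dec": 16,
}

total = sum(counts.values())
expected = total / len(counts)  # assumption: equal chance for every month

# Pearson chi-square statistic against the uniform expectation.
chi_square = sum((n - expected) ** 2 / expected for n in counts.values())

peak = counts["Jul"] + counts["Aug"] + counts["Sep"]

print(total)                          # 217
print(peak, round(100 * peak / total, 1))  # 83 38.2
print(round(chi_square, 2))           # 31.9
```

With 11 degrees of freedom, the 5% critical value of the chi-square distribution is about 19.68, so a statistic of about 31.9 suggests the unevenness is unlikely to be chance alone. It says nothing, of course, about what the reason is.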

One hypothesis is that an independence war is more active in warm weather. In winter, the harsh cold weather slows down the progress, or rather, slows down the offense more than the defense, as the relatively inactive defending side is not impacted as much by external factors as the active attacking side is. There's a small surge in February. It may be due to the fact that at the end and beginning of a year, people are more or less in a holiday mood. After that, the independence activity resumes, with a burst of pent-up energy.

To test the hypothesis that low temperature in winter reduces the chance of independence, or rather, reduces the success rate of offense more than that of defense, I came up with one idea: plot the national days against the geographic latitude of the countries. If a country is in the southern hemisphere, its winter falls in the northern summer. If the hypothesis is correct, we should see a reverse trend for those countries: their independence days will largely concentrate in the (northern) winter months. In addition, the countries near the equator will show less correlation than those at higher latitudes, because the seasonal change at lower latitudes is not as obvious.
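
One way to express this test in code is to convert each national day to its "local season" month by shifting southern-hemisphere dates by six months, so that local summer lines up with the northern summer months. The Python sketch below is my own illustration: the function names and the 23.5-degree cut-off for "near the equator" are assumptions, and the two sample rows are illustrative only, not taken from the data file.

```python
def local_season_month(month: int, latitude: float) -> int:
    """Map a calendar month (1-12) to the northern-hemisphere month with the
    same season. Southern latitudes (negative) are shifted by six months."""
    if latitude < 0:
        return (month - 1 + 6) % 12 + 1
    return month


def latitude_band(latitude: float, tropic: float = 23.5) -> str:
    """Rough split between low and higher latitudes (cut-off is an assumption)."""
    return "low latitude" if abs(latitude) < tropic else "higher latitude"


# Illustrative rows: (national day month, approximate capital latitude).
rows = [
    (7, 38.9),   # a July national day, capital near 38.9 degrees north
    (1, -35.3),  # a January national day, capital near 35.3 degrees south
]

for month, latitude in rows:
    print(month, latitude, local_season_month(month, latitude), latitude_band(latitude))
```

If the cold-weather hypothesis held, the shifted months of the higher-latitude countries in both hemispheres would pile up in the same summer range.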

There's no "general" or "average" latitude of a country available. So I use the latitude of the capital of the country as an approximation, a method justified by the fact that a capital is usually one of the most contested cities in an independence war. The data thus combined is in the Appendix. A plot of the data is Fig.2, where the y-axis is the latitude of the capitals of the countries.


Fig.2 National days distribution on months and geographic latitude

As we can see, there's no clear correlation between latitude and independence days. Both northern and southern hemisphere countries have more independence days in the (northern) summer and early fall. There must be some other factors that contribute to the seasonal distribution of the national or independence days of the world's countries. This research will continue, and preliminary data and results will be given on this web page.

Appendix

NationalDayLatitudeData.txt
National Days are from OfficeHolidays, and Capital latitude data are from Wikipedia. The data in the datafile is tab-delimited and can be copied into Excel for analysis and plotting.

Originally posted to my web site.

Forum discussion

historum.com, general-history

Charge job application fees to stop spamming

In the current fierce competition for limited job openings, some job applicants resort to resume spamming in the hope of getting one or two...