Algorithm for learning languages

TheallIneed.com/NC&T/CU
The development -- which has a patent pending -- has implications for speech recognition and for other applications in natural language engineering, as well as for genomics and proteomics. It also offers new insights into language acquisition and psycholinguistics.

"The algorithm -- the computational method -- for language learning and processing that we have developed can take a body of text, abstract from it a collection of recurring patterns or rules and then generate new material," explained Shimon Edelman, a computer scientist who is a professor of psychology at Cornell and co-author of a new paper, "Unsupervised Learning of Natural Languages," published in the Proceedings of the National Academy of Sciences (PNAS, Vol. 102, No. 33).

"This is the first time an unsupervised algorithm is shown capable of learning complex syntax, generating grammatical new sentences and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics," he said.

Unlike previous attempts at developing computer algorithms for language learning, the new method, called Automatic Distillation of Structure (ADIOS), successfully identifies complex patterns in raw texts. The algorithm discovers the patterns by repeatedly aligning sentences and looking for overlapping parts.

For example, the sentences "I would like to book a first-class flight to Chicago," "I want to book a first-class flight to Boston" and "Book a first-class flight for me, please" may give rise to the pattern "book a first-class flight" -- if this candidate pattern passes the novel statistical significance test that is the core of the algorithm.
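
As a rough illustration of the pattern-finding step described above, the following Python sketch looks for the longest word sequence shared by a minimum number of sentences. It is not the ADIOS implementation: the function names (`tokenize`, `ngrams`, `candidate_patterns`) are hypothetical, and a plain count threshold (`min_support`) stands in for the statistical significance test, whose details the article does not give.

```python
from collections import Counter

def tokenize(sentence):
    # Lowercase and strip surrounding punctuation; hyphenated words stay whole.
    words = (w.strip(".,!?") for w in sentence.lower().split())
    return [w for w in words if w]

def ngrams(tokens, n):
    # All distinct contiguous word sequences of length n in one sentence.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def candidate_patterns(sentences, min_support):
    """Return the longest word sequences found in at least min_support sentences.

    Assumption: a simple count threshold replaces the significance test
    that the article calls the core of the algorithm.
    """
    token_lists = [tokenize(s) for s in sentences]
    max_len = max((len(t) for t in token_lists), default=0)
    for n in range(max_len, 0, -1):
        counts = Counter()
        for tokens in token_lists:
            counts.update(ngrams(tokens, n))  # a set, so each sentence counts once
        found = sorted(g for g, c in counts.items() if c >= min_support)
        if found:
            return [" ".join(g) for g in found]
    return []

sentences = [
    "I would like to book a first-class flight to Chicago",
    "I want to book a first-class flight to Boston",
    "Book a first-class flight for me, please",
]
patterns = candidate_patterns(sentences, min_support=3)
print(patterns)  # ['book a first-class flight']
```

With `min_support=2` the sketch would instead return the longer sequence shared by the first two sentences only, which shows why some principled test is needed to decide which candidates count as patterns.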

If the system also encounters the sentences "I need to book a direct flight from New York to Tel Aviv" and "I would like to book an economy flight," it may infer that the phrases "first-class," "direct" and "economy" are equivalent in the context of the new pattern. "Because such equivalence sets can contain other patterns -- in turn containing further patterns, and so on -- the resulting body of knowledge grows recursively, as a sort of forest of branching trees of possibilities," said Edelman.
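
The idea of an equivalence set can be sketched in a few lines of Python: collect every word that fills the same slot in a known pattern. This is only an illustration under stated assumptions, not the published method. The pattern is hard-coded as a regular expression (`SLOT_PATTERN`, a hypothetical name), the slot is assumed to hold a single word, and the articles "a" and "an" are treated as interchangeable.

```python
import re

# Assumed pattern: "book a/an <slot> flight", with a one-word slot.
SLOT_PATTERN = re.compile(r"\bbook an? (\S+) flight\b")

def equivalence_set(sentences):
    """Collect the words that fill the slot of the assumed pattern."""
    fillers = set()
    for sentence in sentences:
        match = SLOT_PATTERN.search(sentence.lower())
        if match:
            fillers.add(match.group(1))
    return fillers

sentences = [
    "I would like to book a first-class flight to Chicago",
    "I want to book a first-class flight to Boston",
    "Book a first-class flight for me, please",
    "I need to book a direct flight from New York to Tel Aviv",
    "I would like to book an economy flight",
]
fillers = sorted(equivalence_set(sentences))
print(fillers)  # ['direct', 'economy', 'first-class']
```

In the recursive picture Edelman describes, a slot filler could itself be a pattern with its own slots; this sketch handles only the single-word case.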

[Photo: Shimon Edelman. Credit: Cornell University]

He added, "ADIOS relies on a statistical method for pattern extraction and on structured generalization -- two processes that have been implicated in language acquisition. Our experiments show that it can acquire intricate structures from raw data, including transcripts of parents' speech directed at 2- or 3-year-olds. This may eventually help researchers understand how children, who learn language in a similar item-by-item fashion and with very little supervision, eventually master the full complexities of their native tongue."

In addition to child-directed language, the algorithm has been tested on the full text of the Bible in several languages, on artificial context-free languages with thousands of rules and on musical notation. It also has been applied to biological data, such as nucleotide base pairs and amino acid sequences. In analyzing proteins, for example, the algorithm was able to extract from amino acid sequences patterns that were highly correlated with the functional properties of the proteins.

The new method was developed jointly with David Horn and Eytan Ruppin, professors of physics and computer science, respectively, at Tel Aviv University, and with Zach Solan, a doctoral student there and the lead author on the paper. Their collaboration with Edelman was supported in part by the U.S.-Israel Binational Science Foundation.
