Monday, 07 May 2018

The CMU Pronouncing Dictionary (also known as CMUdict) is an open source pronouncing dictionary originally created by the Speech Group at Carnegie Mellon University (CMU) for use in speech recognition research.

CMUdict provides an orthographic-to-phonetic mapping for English words in their North American pronunciations. It is commonly used to generate pronunciation representations for automatic speech recognition (ASR), e.g. the CMU Sphinx system, and text-to-speech synthesis (TTS), e.g. the Festival system. CMUdict can also be used as a training corpus for building statistical grapheme-to-phoneme (g2p) models that generate pronunciations for words not yet included in the dictionary.

The most recent release is 0.7b; it contains over 134,000 entries. An interactive lookup version is available.



Database Format

The database is distributed as a plain text file with one entry per line in the format "WORD  <pronunciation>", with a two-space separator between the two parts. If a word has multiple pronunciations, the variants are identified by numbered headwords (e.g. WORD(1)). The pronunciation is encoded using a modified form of the ARPABET system, with stress marks of levels 0, 1, and 2 added to vowels. A line-initial ;;; token indicates a comment. A derived format, directly suitable for speech recognition engines, is also available as part of the distribution; this format collapses stress distinctions, which are typically not used in ASR.
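The format described above can be read with a few lines of code. The following is a minimal sketch in Python, not part of the CMUdict distribution: the function names parse_cmudict and strip_stress are hypothetical, and the sample lines are illustrative entries written in the described format rather than text copied from a release. The sketch assumes variants are marked only by a trailing "(n)" on the headword and that stress is marked by a trailing digit on a phoneme.

```python
import re
from collections import defaultdict

# Matches a headword carrying a numbered variant suffix, e.g. "READ(1)".
_VARIANT = re.compile(r"^(?P<word>.+?)\((?P<n>\d+)\)$")


def parse_cmudict(lines):
    """Return {WORD: [phoneme_list, ...]} from CMUdict-style lines.

    Hypothetical helper: skips blank lines and ';;;' comments, splits each
    entry on the first two-space separator, and groups numbered variants
    under the base headword.
    """
    entries = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(";;;"):
            continue  # blank line or comment
        head, _, pron = line.partition("  ")
        if not pron:
            continue  # no two-space separator: not a valid entry
        match = _VARIANT.match(head)
        word = match.group("word") if match else head
        entries[word].append(pron.split())
    return dict(entries)


def strip_stress(phonemes):
    """Drop the trailing 0/1/2 stress digit from each phoneme."""
    return [p.rstrip("012") for p in phonemes]


# Illustrative sample lines in the described format (an assumption,
# not quoted from a particular release).
SAMPLE = [
    ";;; a comment line",
    "READ  R EH1 D",
    "READ(1)  R IY1 D",
    "DICTIONARY  D IH1 K SH AH0 N EH2 R IY0",
]

if __name__ == "__main__":
    table = parse_cmudict(SAMPLE)
    for word, prons in table.items():
        for pron in prons:
            print(word, pron, strip_stress(pron))
```

Stripping the stress digits only approximates the idea behind the derived, stress-free format; the layout of the file shipped in the distribution may differ in other respects.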



History



Applications

  • The Unifon converter is based on the CMU Pronouncing Dictionary.
  • The Natural Language Toolkit contains an interface to the CMU Pronouncing Dictionary.
  • The Carnegie Mellon Logios tool incorporates the CMU Pronouncing Dictionary.
  • PronunDict, a pronunciation dictionary of American English, uses the CMU Pronouncing Dictionary as its data source. Pronunciation is transcribed in IPA symbols. This dictionary also supports searching by pronunciation.


See also

  • Moby Pronunciator, a similar project




External links

  • The current version of the dictionary is at SourceForge, although there is also a version maintained on GitHub.
  • Homepage - includes database search
  • RDF - the dictionary converted to Resource Description Framework by the open source Texai project.

Source of the article: Wikipedia
