tag:blog.datamuse.com,2013:/posts Datamuse blog 2024-11-01T15:21:10Z tag:blog.datamuse.com,2013:Post/1924382 2023-01-02T19:46:20Z 2024-07-27T20:53:07Z Our 2022 Waterloo co-op term: In which organising words is a labour of love!

This past Fall, Datamuse hired three talented computer science students from the University of Waterloo in Ontario, Canada. Over the course of their three-month co-op they improved our word-finding apps and built a new game. These projects required a diverse set of skills, spanning topics in natural language processing (NLP), data visualization, distributed data processing, design, and web application development. In this post, we will showcase eight of the projects the students completed during the Fall term. We're grateful to Hannah, Max, and Nicole for all their hard work over the past few months!


1. Concept clusters


Peter Mark Roget, an English physician in the 19th century, is the mastermind behind the thesaurus that we all know and love. Outside of his medical practice, he dedicated his life to curating and categorizing words, eventually publishing the "Thesaurus of English Words and Phrases" which grouped 15,000 words into more than 1,000 categories. This work laid the foundation for the synonym dictionaries that writers use today to find alternative words. While the internet now provides many tools (including our own OneLook Thesaurus) that can find synonyms and related words, it's hard to find the kind of organized taxonomy that Roget created. That's why we added "concept clusters" to OneLook – groups of related words and phrases that are automatically derived from data.  If you've used OneLook, you may have noticed these clusters at the bottom of every results page and in every word card that appears when you click on a result.  Last Fall, our co-op students Max and Nicole used modern AI tools (specifically the GPT-3 large language model from OpenAI) to come up with new titles for our 8,000+ clusters covering nearly a million words and phrases.  They evaluated the titles for accuracy, specificity, and conciseness, resulting in a set of titles that had 70% fewer duplicates than before.  They also used hierarchical clustering techniques to organize the concept clusters into 30 broad top-level subjects, like biology and art, which are now displayed on the OneLook homepage in our new "subject index."  It's now easier to explore words in highly specialized topics such as "turning point in life" (with 54 words) or "delivering a sermon" (50 words) or "manipulating audio signals" (92 words).
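To give a flavor of what hierarchical clustering does, here is a minimal Python sketch of the bottom-up (agglomerative) variety. It is not the pipeline Max and Nicole built: the cluster names, the 2-D vectors standing in for real features, and the centroid-distance merge rule are all assumptions chosen to keep the example small.

```python
import math

def centroid(points):
    """Average a list of equal-length vectors."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]

def agglomerate(vectors, target):
    """Merge the two closest groups repeatedly until `target` groups remain."""
    groups = [[name] for name in vectors]
    while len(groups) > target:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                ci = centroid([vectors[n] for n in groups[i]])
                cj = centroid([vectors[n] for n in groups[j]])
                distance = math.dist(ci, cj)
                if best is None or distance < best[0]:
                    best = (distance, i, j)
        _, i, j = best
        # j > i, so removing group j leaves index i valid.
        groups[i] += groups.pop(j)
    return groups

# Hypothetical concept clusters with made-up 2-D feature vectors.
toy_vectors = {
    "cell biology": (0.0, 0.0),
    "genetics": (0.0, 1.0),
    "painting": (10.0, 10.0),
    "sculpture": (10.0, 11.0),
}
print(agglomerate(toy_vectors, target=2))
```

With real data the vectors would have many more dimensions and `target` would be closer to the 30 top-level subjects mentioned above.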


2. Word usage timelines

 When learning a new word or phrase, it's helpful to understand not only its meaning, but also how, when, where, and why it's used. The "when" aspect is especially important for fiction writers who want to avoid using words that are out of place for the time period they are writing about. To help with this, Nicole and Max used data from Google Books NGrams to build a lightweight interactive timeline that shows how frequently a word has been used in the past. These timelines can be found in the "word cards" on OneLook Thesaurus or certain parts of RhymeZone when you click on a word. By selecting one of the bars on the timeline, you can see a word that is more appropriate for the selected time range, if applicable.
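The core of a timeline like this is simple bucketing. The Python sketch below sums per-year counts into per-decade bars and draws them as text. The decade granularity, the function names, and the sample counts are assumptions for illustration; the post does not describe the actual data format or bar width used on the sites.

```python
from collections import defaultdict

def bucket_by_decade(yearly_counts):
    """Sum per-year usage counts into per-decade totals, oldest first."""
    totals = defaultdict(int)
    for year, count in yearly_counts.items():
        totals[year // 10 * 10] += count
    return sorted(totals.items())

def render_bars(decade_totals, width=20):
    """Draw one text bar per decade, scaled to the busiest decade."""
    peak = max(count for _, count in decade_totals) or 1
    lines = []
    for decade, count in decade_totals:
        bar = "#" * round(width * count / peak)
        lines.append(f"{decade}s {bar}")
    return lines

# Hypothetical counts for one word; real data would come from an n-gram corpus.
sample = {1901: 2, 1905: 3, 1912: 10, 1918: 5, 1923: 40}
for line in render_bars(bucket_by_decade(sample)):
    print(line)
```

A production version would likely normalize by the total number of words published in each period, since raw counts grow with the size of the corpus.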


3. Verbloom

 The success of Wordle has shown that there’s a tremendous appetite for fun, quick word games that are updated daily.   We’d like to create such a game that also teaches vocabulary in some way:  most word games, after all, are more focused on letters than on meanings.   Verbloom is a game that asks you to find the 3 words that are most related to each other within a crowd of unrelated words.  Finding the answer will often lead you  to discover surprising meanings for words you thought you knew well.   The entire co-op team  – Hannah, Max, and Nicole – collaborated to create this game, from its beautiful visual design, to its front-end, back-end, and daily game generation algorithm.  We hope you’ll enjoy a game or two. We expect to offer subject-themed versions of this game for educators in 2023.
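For the curious, the puzzle itself can be stated in a few lines of Python: among all groups of three words, pick the one whose pairwise relatedness is highest. This is only a sketch of the problem, not the game's generation algorithm, and the similarity scores below are invented; a real version would draw them from a word-relatedness model.

```python
from itertools import combinations

def most_related_triple(words, similarity):
    """Return the 3 words whose pairwise similarities sum highest."""
    def triple_score(triple):
        return sum(similarity(a, b) for a, b in combinations(triple, 2))
    return max(combinations(words, 3), key=triple_score)

# Hypothetical similarity scores; unlisted pairs count as unrelated (0.0).
SCORES = {
    frozenset(("bank", "shore")): 0.8,
    frozenset(("bank", "coast")): 0.7,
    frozenset(("shore", "coast")): 0.9,
    frozenset(("bank", "lamp")): 0.1,
}

def toy_similarity(a, b):
    """Look up a symmetric similarity score for two words."""
    return SCORES.get(frozenset((a, b)), 0.0)

crowd = ["lamp", "bank", "pencil", "shore", "coast"]
print(most_related_triple(crowd, toy_similarity))
```

Note how "bank" belongs to the answer through its riverside sense, which is the kind of surprising meaning the game is meant to surface.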


4. Idiom usage examples


RhymeZone and OneLook, like many dictionaries, provide usage examples that show how a word is used in context. These examples come from real sources such as famous quotes, books, and Wikipedia articles, and are chosen based on research we conducted in 2020 to best illustrate the meaning of the word. However, until recently, our usage examples only worked for single-word terms and certain two-word compounds, and did not include idiomatic phrases like "time of one's life" or "bee in your bonnet" or "small potatoes." These idioms are particularly important for our non-native English speaking readers, so adding usage examples for them was a top priority. Max and Hannah reimplemented our data pipeline to cover idiomatic phrases and variations of phrases in an efficient manner. Now, when you search for "time of one's life," you will find a usage example featuring George from Harry Potter: Goblet of Fire saying "Yeah, we're having the time of our lives here."


5. Word card redesign  

How can we organize the most important information about a word in a way that promotes learning and discovery? "Word cards" appear when you click on a word on OneLook Thesaurus or in the Thesaurus tab on RhymeZone, and in the future they will be available in other parts of our apps. Word cards display definitions, usage examples, and links to additional resources for learning. Nicole redesigned the look and feel of the word cards, giving them a fresh, icon-based layout. The cards also have interactive elements, such as the ability to reorder usage examples based on the selected definition by clicking on it.


6. Word of the Day 

Several dictionary websites offer a “word of the day” feature.  Ours will be different from others – our words will be selected based on what’s currently happening in world news. Nicole implemented a data pipeline that crawls recent news stories to find words that are both interesting and meaningfully connected to recent news, maximizing the chance that a word is selected which teaches the user something new.  Look for this feature in early 2023.

7. Humour ranking


What words are inherently funny?  Why is “gobbledygook” a funnier way to say “nonsense” than “nonsense” itself?   Why is “yam” a funny vegetable, but “celery” is not?   Believe it or not, NLP researchers have considered these questions.   We wanted to add the ability to sort words by funniness to our recently-added list of sort orders in OneLook Thesaurus (and in RhymeZone’s thesaurus tab).  So Max implemented a model from this paper that ranks words by their inherent funniness, and hilarity ensued!  Try it by selecting the “Closest meaning first” select menu on the Thesaurus after you do a search and then selecting "Most funny sounding".   (Max also created a standalone website called FunnyBone that shows off the work; check it out!)


8. OneLook Thesaurus API tests and explainer video

OneLook Thesaurus has grown in complexity over the years as it offers ever more ways to find words, phrases, and ideas.  To help us rein in this complexity, Hannah created a battery of acceptance tests that exercises different aspects of our backend, helping us prevent software regressions from reaching our production service. This has already been useful in keeping the service stable, and it will be invaluable as we add more languages in 2023.    She and the others also helped edit this video that goes over some of the quiet joys of using the English version of OneLook Thesaurus.   Give it a watch!


]]>
tag:blog.datamuse.com,2013:Post/1765405 2021-12-01T15:00:06Z 2024-11-01T15:21:06Z 25 Years of RhymeZone and OneLook: What we're musing on lately

Our word-finding websites RhymeZone and OneLook both turned 25 this year—nearly the age of the Web itself!—and they're reaching an ever-growing audience of writers, poets, students, puzzle-makers, marketers, and scholars.  We're still building new features for these sites and others, all of them geared toward helping the world find words and ideas more effectively.  Recent advances in AI make this an exciting time to work in this area!    

Our last post to this blog was in 2016 (!), so we’re overdue for an update on what we’ve added since then. Here are 10 highlights from the past 5 years:

  1. OneLook Thesaurus: This site is a rewrite of the "reverse dictionary" tool that we made way back in 2003, and our take on what a thesaurus can be in the present day.  Its single-minded purpose is to inspire you with the right word, whatever your task, and however you're able to describe the word to us.  Like an old-time thesaurus it lets you find synonyms, but it also lets you find related words and ideas more effectively.   You can search for words with natural descriptions like "process that keeps plants green" or "expressing frustration or impatience with someone" or "types of hard wood".  It's gotten dramatically better at this in the past year, so give it a try if you haven't used it in a while.  Read some reviews in AskReddit, MakeUseOf, LifeHacker, and Forbes.

    The thesaurus features are also integrated into RhymeZone in the "thesaurus" tab, where there are special treats for poets such as a "meter bar" that reveals the words that match the rhythm you need.

  2. Google Docs add-on:  If you use Google Docs regularly you might enjoy using our Docs add-on, which lets you highlight words or regions of your text and find synonyms and related words using the same technology described above.  It's branded as the "OneLook Thesaurus" add-on, but it includes many of our other services, too, such as rhymes, adjectives, and quotations. It's now the most popular thesaurus add-on for Google Docs.  Check out the reviews at EmergingEdTech, PCMag, and MakeUseOf!

  3. Advanced search for RhymeZone:  RhymeZone's user interface hasn't changed much since the 2000s, and the simple lists of words it returns are how many people seem to like it.  For those who want more detail, songwriters especially, we built an advanced search option. This column-oriented interface offers a much larger set of possible results and more ways to sift through them, with a deeper integration of meter, lyrics, definitions, and popularity data.

  4. Spruce:  Introduced in 2020, Spruce helps you find quotations, lyrics, jokes, and proverbs that are related to the topics in your writing.  These can strengthen your arguments or spark new, unexpected ideas.  What makes it more powerful than a keyword-based quotes search engine is that it tries to synthesize all the ideas in your text to come up with suggestions that are relevant to your specific themes. Spruce is available as a website, a Chrome extension, and as part of the Google Docs add-on mentioned above, where it's easier to select large blocks of text.  Click the question mark on the Spruce web page for more information on how it works.

  5. Rimar.io and OneLook Tesauro:  To break out of an English-only mentality, in 2017 we made a Spanish-language version of RhymeZone called Rimar.io, and more recently we started testing a Spanish version of OneLook Thesaurus with all the same features as the English version.  There's a lot of work to do to get Datamuse services working in other languages, but we’re up to the challenge.

  6. “Mentions” on RhymeZone:  When you're trying to learn a new word, the Mentions tab on RhymeZone provides engaging examples of the word in a sentence.   It’s based on research we conducted last summer to determine which factors make usage examples compelling and memorable.  Our software combs through 30,000 public domain books and millions of lyrics, quotes, and Wikipedia articles to find good examples for every word, no matter how rare.

  7. Lyrics features:  RhymeZone's "Lyrics and poems" tab now includes rhymed verses sampled from 2 million songs and poems in English and Spanish, organized by genre and rhyme pair.  You can do conceptual searches, too, by searching for a whole line—for example, searching for “it won't stop raining” will find hundreds of different ways this thought has been expressed in song.

  8. Mobile app updates:  Our RhymeZone mobile apps for iOS and Android, both highly rated at 4.8 stars, give you a slimmed-down version of RhymeZone you can use without an Internet connection, with ad-free access to the website’s more expansive features when you’re back online.  Over the last few years we’ve expanded the built-in vocabulary of the app and made several design enhancements to both apps, especially to the iOS version: a dark mode option, a search history feature, and easy access to Apple’s dictionary definitions.

  9. Datamuse API:  On a good day our free developer API serves 200 queries per second from several hundred educational apps, games, and search engines, such as Flocabulary, RapPad, Voice Dream Writer, SmashWords, LyricStudio, and DomainWheel.  In the past few years we’ve added more word metadata, wildcard patterns, Spanish support, and pronunciation features.

  10. Contests:  We've run 8 different themed poetry contests on Twitter in the past 5 years, with cash prizes going both to the winning poets and to non-profits that they can designate.  Our latest contest in early 2021 was on an unusual but urgent theme: COVID-19 vaccines. We’ve donated to 19 non-profits over the years, including 9 this year, based on the selections of the winners. Follow us on Twitter for updates on these contests and other news about our sites. 



Do these sorts of applications interest you?  We’re looking for computer science students and seasoned programmers to help us build more and better creativity tools in 2022. Do you have experience in, or enthusiasm for, natural language processing, machine learning, or data visualization? Are you available for contract work in 2022?  
Contact us!


Contributors:   Doug Beeferman, Harvey Beeferman, Linus Wong, Jonah Fried, Castedo Ellerman, Fritz Holznagel


]]>
tag:blog.datamuse.com,2013:Post/1113958 2016-12-08T20:07:19Z 2024-11-01T15:21:10Z Datamuse API v1.1 The Datamuse API, our word-finding tool for developers, is now a year old! It now answers nearly 3 million queries each day from Datamuse's websites OneLook and RhymeZone, as well as from educational apps like Voice Dream Writer, RapPad, and Flocabulary.

A new minor version of the API, 1.1, is publicly available today. This upgrade includes some accuracy improvements to the "means like" (ml) and "sounds like" (sl) constraints.   It also lets developers get some useful metadata about the words and phrases that come back from /words queries: pronunciation, definitions, word popularity, and broad syntactic categories. 

These metadata fields can be used to segment, order, or filter results for end users. For example, the OneLook Thesaurus uses the metadata to segment the answers by part-of-speech and show definitions when the user clicks on a word. The new RhymeZone advanced search interface uses the word frequency data to break ties when sorting through the rhymes for a given word. 
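As a rough illustration of segmenting and ordering by metadata, the Python sketch below works on a hand-written sample shaped like a /words response that was requested with part-of-speech and frequency metadata. The tag layout (bare part-of-speech codes, and an "f:" prefix for frequency) is an assumption based on the API docs, and the words, scores, and frequency values are made up.

```python
from collections import defaultdict

# Part-of-speech codes assumed to appear as bare tags.
PARTS_OF_SPEECH = {"n", "v", "adj", "adv", "u"}

def frequency(entry):
    """Read the word frequency from an 'f:<number>' tag, or 0.0 if absent."""
    for tag in entry.get("tags", []):
        if tag.startswith("f:"):
            return float(tag[2:])
    return 0.0

def segment_by_pos(entries):
    """Group result words by each part-of-speech tag they carry."""
    groups = defaultdict(list)
    for entry in entries:
        for tag in entry.get("tags", []):
            if tag in PARTS_OF_SPEECH:
                groups[tag].append(entry["word"])
    return dict(groups)

def break_ties_by_frequency(entries):
    """Sort by score, using word frequency to order equal scores."""
    return sorted(entries, key=lambda e: (-e["score"], -frequency(e)))

# Illustrative values only, not real API output.
sample = [
    {"word": "glad", "score": 100, "tags": ["adj", "f:20.5"]},
    {"word": "joy", "score": 100, "tags": ["n", "f:45.0"]},
    {"word": "rejoice", "score": 90, "tags": ["v", "f:5.1"]},
]
print(segment_by_pos(sample))
print([e["word"] for e in break_ties_by_frequency(sample)])
```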

For more info, please see the Datamuse API docs, in particular the new section on metadata flags. If you want to get started using the API, consider using one of the client libraries that developers have added over the past year. 

Note that the API is still geared for English only, but more languages are coming in the next year.

]]>
tag:blog.datamuse.com,2013:Post/996280 2016-02-18T16:20:15Z 2023-10-31T02:01:30Z Voice-based writing help with the RhymeZone app for Alexa

If you own an Amazon Echo or another Alexa device such as a Fire TV, try out the new Alexa skill for RhymeZone!

This skill lets you find words from the comfort of your couch by shouting out a command.  You can say things like "Find words related to dog" or "Give me a 6-letter word for penguin" or "Find rhymes for cheese" or "Find adjectives for strawberry" and get back a rapid-fire list of matches.

This is useful when you're in the midst of writing and you don't want to interrupt your typing or handwriting flow to look for alternate words in your favorite Web-based or tree-based reference tool.  Out of the box Alexa already lets you ask for word definitions, but the RhymeZone skill gives you several more options: You can ask for "rhymes", "related words", "synonyms", and "adjectives".

These options are mostly self-explanatory, except for "adjectives". "Adjectives" uses the new "descriptive words" feature on RhymeZone to give you words that commonly modify a given noun.  The "strawberry" example above will give you choices like "wild", "luscious", "juicy", "fragrant", and dozens more.

You can qualify any query with a starting letter.  For example, saying "Find words related to dog that start with 'P'" will give you such words as "pug", "poodle", "paw", "pet", "pound", and "Pavlov". You also can add a restriction on the length of the word, as in the 6-letter penguin example above, which returns "gentoo", "adelie", and a few others.  These two features might be useful for getting help with crossword puzzle clues — use them at your own risk, since you're only cheating yourself!

For all of the supported query types, the answers that come back are ordered by how popular they are.  Since some queries produce a lot of answers, Alexa will recite them 10 at a time and give you the option of flipping through the results by saying "continue" after each set of 10.
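The starting-letter and length qualifiers, and the 10-at-a-time recital, amount to a filter followed by paging. Here is a minimal Python sketch of that logic; the function names are hypothetical and this is not the skill's actual code. The sample words are the ones quoted above, plus a few extras added so the filters have something to reject.

```python
def filter_matches(words, starts_with=None, length=None):
    """Keep words that satisfy the optional starting-letter and length limits."""
    result = []
    for word in words:
        if starts_with and not word.lower().startswith(starts_with.lower()):
            continue
        if length is not None and len(word) != length:
            continue
        result.append(word)
    return result

def pages(items, size=10):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

dog_words = ["pug", "leash", "poodle", "paw", "bark", "pet", "pound", "Pavlov"]
print(filter_matches(dog_words, starts_with="P"))

penguin_words = ["gentoo", "emperor", "adelie", "king"]
print(filter_matches(penguin_words, length=6))

for page in pages(list(range(25))):
    print(len(page), "results, then wait for 'continue'")
```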

To install the RhymeZone skill on your Alexa device, enable it from this page.  (If that doesn't work, see more detailed instructions from Amazon here.)   Enjoy!


Thanks to Norbert Burger for suggesting the skill, and for guidance on publishing Alexa skills.

]]>
tag:blog.datamuse.com,2013:Post/980380 2016-01-28T15:10:43Z 2023-12-03T04:21:03Z RhymeZone Turns 20 (with updates aplenty)


This week RhymeZone, the rhyming dictionary and thesaurus website, turns 20 years old!  To celebrate, there are several new features to announce.

Some history: I created the “Semantic Rhyming Dictionary” while a student back in 1996, renaming it as the more euphonious "RhymeZone" in 2000. Since then the site has answered billions of search queries from tens of millions of creative people around the English-speaking world: songwriters, copywriters, poets, pranksters, puzzle solvers, and more. It's been the subject of jokes, songs, and copious praise (and some parody, too).  While no reference tool can match the power of the human imagination, my hope is that RhymeZone can assist and augment. Think of it as a colorful companion on your writing excursions.

20 years on, RhymeZone is still a work in progress. While I've made occasional tweaks to the site over the years, I’ve only recently started to invest in more substantial improvements. In this post I'd like to highlight the 7 recent developments I'm most excited about.


1. Lyrics and poems

Alongside your rhymes, RhymeZone now shows you excerpts of poetry and song lyrics that illustrate how your word has been used in rhymes by well-known poets and musicians.

Depending on your writing goal, this new feature can be useful for harvesting more ideas for your work, for steering clear of well-worn rhymes, or for meditating on all the imaginative ways that a topic has been treated in existing music and poetry.

By design, the examples span myriad genres — musical theater, hip hop, pop music, nursery rhymes, classical poetry, and Shakespeare, to name a few.  More than a million songs and poems have been scanned from the Web, and the verses are selected and sorted for each query to prioritize diversity and notability.

Here's how it works on your side of the screen. When you do a normal RhymeZone search for a word such as "beach," a verse will appear in a green box like the one below:

Screenshot 2016-01-25 084803png

You can flip through up to 200 examples by clicking the “↻” icon, or see them all on one page by clicking on the "200 examples" link.  (You can also get to this page using the "Show poetry and lyrics" dropdown option that appears on the desktop website, or by clicking on the "Lyrics and poems" link that appears at the top of any search results page.)

On the lyrics page for “beach” you'll see this Longfellow verse together with a wide range of strange bedfellows:  verses from Lewis Carroll, Bob Dylan, U2, the Ramones, Iron Maiden, Gorillaz, and dozens more.  Click a title to visit the most authoritative page about the work (according to Google's "I'm Feeling Lucky" feature), often a music video, Wikipedia article, or the artist’s own page.

You'll notice that the majority of these "beach" verses (142 out of 200, or 71%, to be precise) pair it with the word "reach."  This means if you're aiming to be less predictable in your songwriting you might look beyond "reach."  Fortunately many other choices are within, er, reach, and you can see them by clicking on the grey dropdown box that says "Filter by rhyme...":

Screenshot 2016-01-25 084926png

Here you can narrow down the list to the ones that use some of the less-typical rhymes like “peach” and “teach.”  For example there's Syleena Johnson's pithy (but not pitty) couplet from More: "Like sand to a beach / The sweet to a peach."

Of course, you can see hundreds of other rhyming words for “beach” on the regular RhymeZone search results page where you started, though not all will have as many good example verses.  Like “pleach," a verb that means to intertwine, or “medicinal leech.”  There’s surely a song in that!

The lyrics feature is also good at revealing imaginative multi-word rhymes (sometimes called broken rhymes) as well as near rhymes that match the target word imperfectly.   For example, the word "nocturne" (a sad piano piece) has no perfect single-word rhymes, but Stephen Sondheim paired it cleverly in A Little Night Music:

Screenshot 2016-01-25 085034png

And then there's my favorite rhyme in musical theater, from Stephen Schwartz's Pippin, which comes up when you search for "massacre":

Screenshot 2016-01-27 213415png

The system finds rhymes even when they’re hiding in front of a common final word, as in this couplet that comes up for “shake”:

Screenshot 2016-01-27 214153png

Not to mention internal rhyme, where two or more rhymes are confined within the same line.  For example, in the results for “missing”:

Screenshot 2016-01-27 213814png



2. Near rhymes and broken rhymes

Rhyme is a candy sampler of many rich flavors. True or perfect single-word rhyme was the only flavor offered by RhymeZone for a long time, but in recent years I've gradually added more kinds of near rhymes (also called oblique or false or slant or imperfect rhymes) to the search results.  These near misses, useful in some contexts but not others, are given their own section titled “Words and phrases that almost rhyme."  They’re also available from the tab labeled “Near rhymes”:

Screenshot 2016-01-25 222600png


Let’s look at a few examples of near rhymes in the wild:

  • “Highness” doesn’t rhyme with “finest” in the strictest sense — the final consonant sounds don’t match — but in many contexts it’s a delightful pair, such as in a Danny Elfman song from Corpse Bride:  “Rubbing elbows with the finest / Having crumpets with her highness."  

  • Similarly, “forest” and “chorus” aren’t perfect rhymes for the same reason, but they pair well together because the two concepts co-exist well in nature, as in Cactus Tree by Joni Mitchell:  “He has missed her in the forest / While he showed her all the flowers / and the branches sang the chorus...."

  • Would it seem suspicious to rhyme "dishes" with “delicious”?  The words sound alike but have slightly different endings (the “s” in “delicious” is voiceless, while the final “s” in “dishes” is voiced), so they’re technically imperfect rhymes.  But it’s close enough that even pure-rhyme Sondheim used it in A Funny Thing Happened on the Way to the Forum:  “Wouldn’t she be delicious / Tidying up the dishes”.   (It should be noted that he was sheepish enough about this choice to point out the imperfection in a footnote in Finishing the Hat.)  

  • By contrast, “calling” and “morning” seem very weak as potential rhymes when they’re pronounced in isolation — the stressed syllables are totally different.  But listen to Kendrick Lamar’s intro to HiiiPower (“The sky is falling, the wind is calling / Stand for something, or die in the morning”) before passing judgment:  it works.  
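The difference between a perfect rhyme and a near miss like highness / finest can be sketched in Python by comparing the sounds from the last stressed vowel onward. This is a deliberately crude rule, not RhymeZone's: it assumes hand-entered ARPAbet-style pronunciations (a trailing "1" marks the stressed vowel), and it calls anything that shares only the stressed vowel a near rhyme.

```python
def rhyme_tail(phones):
    """Return the phonemes from the last primary-stressed vowel to the end."""
    for i in range(len(phones) - 1, -1, -1):
        if phones[i].endswith("1"):
            return phones[i:]
    return phones

def classify(a, b):
    """Label a pair of pronunciations as 'perfect', 'near', or 'none'."""
    tail_a, tail_b = rhyme_tail(a), rhyme_tail(b)
    if tail_a == tail_b:
        return "perfect"
    if tail_a[0] == tail_b[0]:
        return "near"
    return "none"

# Hand-entered pronunciations, assumed for illustration.
PRON = {
    "beach": ["B", "IY1", "CH"],
    "reach": ["R", "IY1", "CH"],
    "highness": ["HH", "AY1", "N", "AH0", "S"],
    "finest": ["F", "AY1", "N", "AH0", "S", "T"],
}

print(classify(PRON["beach"], PRON["reach"]))
print(classify(PRON["highness"], PRON["finest"]))
print(classify(PRON["beach"], PRON["finest"]))
```

A rule this loose accepts far too many pairs on real vocabulary, which is exactly the problem described next.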

In all these cases you can see how much the context matters. What makes near rhymes hard from RhymeZone’s perspective is that there are often thousands of possibilities that might work, but most of them are junky. If RhymeZone printed out all of the words that sound as far-off as “morning” when someone searches for “calling," it would take the user hours to read through the results.  So it’s not enough merely to apply purely phonetic rules. To decide which near rhymes are sensible together, RhymeZone now uses several other data sources as well:  song lyrics, word sequence frequencies (from Google Books Ngrams), and the search activity on RhymeZone itself going back more than a decade.

The same thinking applies to multi-word (broken) rhymes, even when there’s a perfect match.  For example, let’s return to Sondheim’s nocturne / clock turn rhyme from earlier in this post.  RhymeZone spits out “clock turn” and “lock turn” and “rock turn” and a couple others when you search for “nocturne,” but it avoids suggesting meaningless phrases like “spock tern” or “glock stern” or “smock spurn” since it has data about which word combinations are plausible.
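Here is a minimal Python sketch of that plausibility check: keep only the candidate phrases that have actually been seen in text, most common first. The counts are invented for illustration, and the threshold and function name are assumptions; the real system draws on several data sources, not a single lookup table.

```python
def plausible_phrases(candidates, phrase_counts, min_count=1):
    """Keep candidate phrases seen at least `min_count` times, most common first."""
    kept = [p for p in candidates if phrase_counts.get(p, 0) >= min_count]
    return sorted(kept, key=lambda p: -phrase_counts[p])

# Hypothetical corpus counts for two-word sequences.
PHRASE_COUNTS = {
    "clock turn": 900,
    "lock turn": 300,
    "rock turn": 120,
}

candidates = [
    "rock turn", "spock tern", "clock turn",
    "glock stern", "lock turn", "smock spurn",
]
print(plausible_phrases(candidates, PHRASE_COUNTS))
```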

For some words RhymeZone still does suggest a fair degree of nonsense, and it fails to find good matches for others.  For instance, when you search for “personable,” it fails to find Sondheim’s masterful “coercin’ a bull."  The highest priority for RhymeZone is to increase the breadth and accuracy of these near rhymes and broken rhymes, since they’re crucial to so many songwriters. 


3. Filter by meaning

Many words have hundreds or even thousands of reasonable rhymes, even when you only count the "perfect" ones.  In such cases it can be helpful to narrow down your choices by meaning, because you usually have some idea of the kind of word you're looking for.

When many words rhyme, RhymeZone will now show you a little grey box at the top right of the results section that looks like this:

Screenshot 2016-01-25 085355png

If you click it and type a topic, RhymeZone will do its best to filter the rhymes and highlight the words it thinks are related to the topic.

Suppose you're Samuel Taylor Coleridge writing the famous unfinished narrative poem Christabel but you're stuck on the final words of two lines:

She stole along, she nothing spoke,

The sighs she heaved were soft and low,

And naught was green upon the [ ? ]

But moss and rarest [ ? ]


Here Coleridge needs a rhyme for "spoke" that means some kind of tree, and he needs a rhyme for "low" that means something related to plants.  If you don't already know the poem, can you guess what he chose?

Search for rhymes of "spoke" and then enter "tree" into the box, and RhymeZone will highlight the winner, "oak," and many varieties thereof.

Screenshot 2016-01-25 085306png

Similarly, a rhyme for "low" that means something related to plants?  You'll get such words as "grow," "aloe," and Coleridge’s choice, "mistletoe."


4. Descriptive words

Another recent addition to RhymeZone is the tab called "Descriptive words," which lists adjectives that are well-suited for a given noun, and nouns that are well-suited for a given adjective.  This is particularly useful for adding imagery to your writing.  

For example, suppose you’re trying to describe mistletoe dramatically, as Coleridge was in Christabel.  What are some adjectives that occur to you?  “Green” and “leafy” occur to me.   These adjectives are relevant to mistletoe but might be too plain for your needs.  Using this new RhymeZone feature you can tap the collective wisdom of humankind to get some more refined ideas:

Screenshot 2016-01-25 173630png

You'll find Coleridge’s “rarest” is there in the list, as well as some other positively charged adjectives like “mystic” and “hallowed” and “holy."  If you were writing a horror story you might prefer “baleful” or “accursed” or “withered."   (“Baleful," by the way, is probably on the list because of Shakespeare’s characterization of mistletoe in Titus Andronicus.)

You can go in the other direction, too — that is, you can find the nouns that are popularly described by a certain adjective. For example, what things are described as leafy?   Here's what:

Screenshot 2016-01-25 173716png

This feature gets its smarts from the Google Books Ngrams data, a publicly available  analysis of millions of English-language books written over the past few hundred years.
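How might candidate adjectives be ranked from counts like these? One common approach for surfacing distinctive modifiers rather than merely frequent ones is pointwise mutual information (PMI) over adjective-noun pair counts. The Python sketch below shows that technique; it is not necessarily what RhymeZone does, and every count in it is invented.

```python
import math

def pmi(pair_count, adjective_count, noun_count, total_pairs):
    """Pointwise mutual information (in bits) of an adjective-noun pair."""
    return math.log2(pair_count * total_pairs / (adjective_count * noun_count))

def rank_adjectives(noun, pair_counts, adjective_counts, noun_counts, total_pairs):
    """Rank the adjectives seen with `noun` by PMI, highest first."""
    scored = []
    for (adjective, seen_noun), count in pair_counts.items():
        if seen_noun == noun:
            score = pmi(count, adjective_counts[adjective],
                        noun_counts[noun], total_pairs)
            scored.append((adjective, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical counts: "green" is common everywhere, "hallowed" is rare
# overall but comparatively frequent in front of "mistletoe".
TOTAL_PAIRS = 1_000_000
NOUN_COUNTS = {"mistletoe": 100}
ADJECTIVE_COUNTS = {"green": 50_000, "hallowed": 200}
PAIR_COUNTS = {("green", "mistletoe"): 10, ("hallowed", "mistletoe"): 4}

ranked = rank_adjectives("mistletoe", PAIR_COUNTS, ADJECTIVE_COUNTS,
                         NOUN_COUNTS, TOTAL_PAIRS)
print(ranked)
```

In this toy data "hallowed" outranks "green" even though it appears with "mistletoe" less often, because it is far rarer overall.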


5. Smart Suggest

As you're entering a word into the RhymeZone search box you'll see a familiar "autocomplete" dropdown that shows you our best guesses as to what you're starting to type.  This will save you from having to type out the entire word, which is particularly timesaving when you're on a mobile device or unsure of the spelling of the word.

Autocomplete and autocorrect are nothing new, but I'd like to call attention to some features of RhymeZone's autocomplete that make it particularly useful for writing.  For one thing, it's very resilient to typos.  For another, it has a couple of fun shortcuts. If you type a word or phrase followed by a question mark ("?"), the autocomplete box will show you terms contextually related to that word or phrase.  This can be a good way to explore alternative directions in your poetry or prose.  For example, suppose you're writing a poem about sitting in front of a crackling fire on a winter evening, and you want to bring some chimney imagery into the picture.  Type "chimney?" and you'll get words like "soot," "flue," and "sweep" in the box.

Screenshot 2016-01-25 085533png

You can also use the asterisk ("*") symbol in the search box to act as a placeholder for any number of letters.   If you type "*nace," for example, you'll see a list of words that end in "nace" like "menace" and "furnace." Or if you're looking for words that start with "a" and end in "ation," type "a*ation" to get choices like "aspiration," "alliteration," and "accommodation."  (This is particularly useful for generating candidates for, well, alliteration — repetition of the starting sounds in a sequence of words.)
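If you want to try the same kind of asterisk matching against your own word list, Python's standard fnmatch module gets you most of the way. This sketch is not how RhymeZone implements it, and the small vocabulary is just for demonstration. One caveat: fnmatch also gives "?" and square brackets special meaning, so it only mirrors the behavior described above for patterns that use "*" alone.

```python
from fnmatch import fnmatchcase

def match_pattern(pattern, vocabulary):
    """Return the words matching a pattern where '*' stands for any run of letters."""
    return [word for word in vocabulary if fnmatchcase(word, pattern)]

vocabulary = [
    "menace", "furnace", "grimace", "nation",
    "aspiration", "alliteration", "accommodation",
]
print(match_pattern("*nace", vocabulary))
print(match_pattern("a*ation", vocabulary))
```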

By the way, it's a passion of mine to get autocomplete working well on search engines of all kinds, especially on dictionary-oriented sites. If you own or know of a site where this might be helpful, point them to this service.  Also, I can't possibly discuss the topic of autocomplete without referencing this excellent webcomic (note: not safe for most workplaces).  


6. API

“Developers! / Developers! / Developers! / Developers!” goes a famous quatrain by an ancient master of identity rhyme, Steve Ballmer.  Over the years we've gotten hundreds of requests from developers wanting to use rhymes and synonyms and such in their websites or mobile apps. It's not hard to create your own basic rhyming dictionary and it's a good programming exercise to do so, but many of the features in RhymeZone (such as the near rhymes and the "meaning" filtering described above) depend upon a large amount of server-side data that you may not want to reproduce on your end.

If you're a developer you may be interested in the JSON API we recently added.  This API gives programmatic access to most of the functionality of both RhymeZone and its button-down sister site, OneLook, and lets you mash up the data in interesting ways.  You can use it in your apps without restriction for up to 100,000 queries per day.

If you’re interested, check out the complete docs here.
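To give a sense of what a call looks like, here is a minimal Python sketch that asks for rhymes of a word, optionally narrowed by meaning as in the Coleridge example earlier. The endpoint and the `rel_rhy`, `ml`, and `max` parameter names reflect the public docs as best we understand them; treat them as assumptions and check the docs before relying on them. The function names are ours.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://api.datamuse.com/words"  # assumed endpoint; see the docs

def build_url(rhymes_with, means_like=None, limit=10):
    """Build a query for rhymes of a word, optionally narrowed by meaning."""
    params = {"rel_rhy": rhymes_with, "max": limit}
    if means_like:
        params["ml"] = means_like
    return f"{BASE}?{urlencode(params)}"

def fetch_words(url, timeout=10):
    """Fetch a query URL and return just the words from the JSON response."""
    with urlopen(url, timeout=timeout) as response:
        return [entry["word"] for entry in json.load(response)]

print(build_url("spoke", means_like="tree"))
```

Passing the printed URL to `fetch_words` performs the actual request. The sketch keeps URL construction separate from the network call so the former can be checked offline, and so your app can count its queries against the daily limit in one place.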


7. Poetry prize

Finally, Datamuse is happy to announce that the 2016 RhymeZone Poetry Prize is now underway.  This year's contest encourages people of all skill levels to write poems on the theme of Community.

The RhymeZone Poetry Prize is somewhat unusual for a writing contest in that there are no strings attached: There are no entry fees or restrictions of any kind other than legal eligibility requirements.  Submissions are posted publicly to the RhymeZone Forum community, where fellow authors can read and critique your work.  And you don't have to rhyme!

Last year’s contest was a great success with more than 3000 authors contributing poems.  We look forward to lots more thought-provoking verse this year.  

The deadline is April 12, the middle of National Poetry Month in the U.S. and Canada (when we plan to have some more RhymeZone updates to report).  See this page for the complete rules and guidelines.


Thank you!

Special thanks go to Harvey Beeferman, Castedo Ellerman, Fritz Holznagel, John Knowles, Linus Wong, Vitus Wong, and many others who have contributed to making and moderating RhymeZone, the RhymeZone mobile apps, RhymeZone Forum, and the Poetry Prize over the years.   And thank you, gentle readers and rhymers, for your two decades of support and feedback!  

That’s it for now!   See you in April.

Doug Beeferman


]]>
tag:blog.datamuse.com,2013:Post/952164 2015-12-18T01:34:39Z 2023-05-06T23:21:26Z Announcing the Datamuse API

The first version of the Datamuse API is now generally available and free for everyone to use!   See http://www.datamuse.com/api/ for more info, including complete docs and examples.

If you're a developer interested in adding some kind of word search feature to your app, you might find the API useful.  It's already been used in two assistive writing tools, a few search boxes (for intelligent autocomplete), and word games.  It also powers key parts of RhymeZone and OneLook.   

Over the past 20 years we've received hundreds of requests for some kind of programmatic access to the data behind our sites from developers eager to make their own writing-oriented apps.  This API finally gives you access to this data, and much more.  We're excited to see what else people create with this service, and we look forward to adding more apps of our own in the near future. 

]]>
tag:blog.datamuse.com,2013:Post/840405 2015-04-14T05:16:43Z 2023-10-24T08:32:44Z Results of the RhymeZone 2014-2015 Poetry Prize

In March we announced the 10 winners of the first RhymeZone poetry prize, as well as 8 honorable mentions.   Please visit http://www.rhymezone.com/contest/ for all the details.   Also, check out this article in the Salem Statesman Journal describing one of the winners.

The Poetry Prize was a great success and we look forward to making it a regular event.  Stay tuned later this year for more information on the next contest.

]]>
tag:blog.datamuse.com,2013:Post/754515 2014-10-13T00:10:23Z 2023-11-20T12:42:55Z Announcing the 2014-2015 RhymeZone Poetry Prize

RhymeZone will make unconditional grants totaling $5,000 to 10 authors of thought-provoking poetry in the United States or Canada!

The submission deadline is Sunday, February 1, 2015 and winners will be announced on Sunday, March 1, 2015. 

Why a poetry prize? It's our way of saying thank you to the poetry community which has been so supportive of RhymeZone over the years — and also a way to encourage the writing of more great verse. And we also just think it will be fun.

Please see http://www.rhymezone.com/contest for more information about how to enter, including contest rules and guidelines. 

]]>
tag:blog.datamuse.com,2013:Post/638231 2014-01-06T03:27:56Z 2024-01-24T19:15:18Z Finding Topeka: OneLook adds part-of-speech filtering

OneLook is proud to announce the arrival of a much-requested feature: filtering by nouns, adjectives, verbs and other parts of speech.

OneLook is a power tool for finding and learning about English words. For the past 18 years we've served scholars, writers, medical transcriptionists, crossword puzzle enthusiasts, language learners, and marketing professionals around the world.  We started as a “meta-dictionary”, a place to find all the different definitions of a word on dictionaries and glossaries across the Web with just one lookup -- hence the name.   

There were relatively few online dictionaries back in 1996, but these days OneLook indexes more than a thousand of them, including nearly 20 million definitions of more than 9 million unique words and phrases in the English language.   

Wildcard features

Sometimes you don’t know the word you want; or you’re looking for a variation of a word or phrase or letter sequence that you do know; or you know part of the word you’re looking for, but can’t remember the whole. Over the years we've added “wildcard” and “reverse dictionary” features to OneLook to address these needs, including a query syntax that lets you find words quickly from OneLook’s large vocabulary.  

The example searches on the homepage show you how to use the basic wildcard features.   Unique on the Web for their flexibility and speed, these features have become the most frequently used function of OneLook, especially as other sites (like Google) now handle regular “forward” dictionary lookups more comprehensively.

Searching for a few good words

With so many sources indexed by OneLook, a lot of wildcard searches produce too many results to be useful. For example, can you think of some words that begin with the letters “abst”?    I bet “abstain” and “abstract” come to mind, and maybe a couple others. OneLook finds a whopping 494 such words and phrases across all of the dictionaries it indexes. You can see them by doing a search for “abst*”. You’ll find “Abstergo Industries”, a fictional megacorporation in a videogame, as well as “abstravagant”, a neologism meaning “weirdly great” that appears only on UrbanDictionary, and “abstergent”, an old-timey word for cleaning.

No offense to Abstergo Industries, but if you’re on the hunt for just a simple word -- for a product name, crossword puzzle, or wedding toast, say -- then most of these 494 results are not useful.

That's why a long time ago we added filtering by “commonness” to help in these situations.    A yellow box like this one shows up after you do a wildcard search:

                      

If you click on the far right option (“Common words”), the list will be winnowed down to the subset of words that are considered “common”, which means they are found in a lot of different dictionaries on OneLook. There are only 30 such results for “abst*”. (Did you miss “abstruse”?)
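
As a rough illustration of the "commonness" idea, here is a Python sketch that counts how many dictionaries list each word and keeps the ones that clear a threshold. The dictionary names, their contents, and the threshold of 2 are all invented for this example; the real cutoff OneLook uses is not stated here.

```python
from collections import Counter

def common_words(dictionaries, min_sources):
    """Return, alphabetically, the words listed in at least min_sources dictionaries."""
    # set(entries) makes sure a dictionary counts at most once per word.
    counts = Counter(word
                     for entries in dictionaries.values()
                     for word in set(entries))
    return sorted(word for word, n in counts.items() if n >= min_sources)

# Hypothetical sources, just for illustration.
DICTIONARIES = {
    "general_a": ["abstain", "abstract", "abstruse", "abstergent"],
    "general_b": ["abstain", "abstract", "abstruse"],
    "slang": ["abstravagant", "abstract"],
    "gaming": ["Abstergo Industries", "abstain"],
}

print(common_words(DICTIONARIES, min_sources=2))
```

Words that appear in only one source, like "abstravagant" in this toy data, drop out of the filtered list.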

New feature:  Filtering by part of speech

Still, 30 is a lot.  What if you know you’re looking for an adjective?  A new feature on OneLook lets you filter words by part of speech.  “Part of speech” refers to the broad syntactic category of a word.   You may know parts of speech by their street names: noun, adjective, verb, and so forth. The new filtering option appears right below the “filter by commonness” option.  It looks like this:

                      

If you click on “adjectives” your results will be further restricted to the subset of words that are known to be adjectives.  In the case of “abst*”, that leaves you with 8 choices, a manageable number to read through.

Suppose you’re looking for a place name that has 6 letters, starts with “t”, and ends in “a”?   This might be the case if, like me, you’re stumped on today’s New York Times crossword puzzle (24 across on the puzzle for January 5, 2014).   Filtering for such words gives you a handful of choices, among them the correct answer (spoiler alert!): “Topeka”.

Tips for effective part-of-speech filtering

  • Instead of clicking in the yellow box, you can access this feature by simply typing “:adjective”, “:noun”, “:verb”, or “:adverb” after your query, e.g. in a search for “abst*:adjective”.

  • For nouns, this feature makes a distinction between common nouns (such as giraffe) and proper names (such as Abraham Lincoln or Topeka), because you’re usually only looking for one or the other.   As a general rule, if you think the word you’re looking for would start with a capital letter if it were printed out in the middle of a sentence, choose “proper names”, otherwise choose “common nouns”.

  • Adverbs are an odd part of speech since they encompass several different kinds of qualifying words, so you may get some surprises.   If you’re a native English speaker, then in grade school you may have learned that adverbs always end in “ly”.  By now you know that’s not true -- that is, assuming you’re not in grade school any longer.     For example, if you scan the adverbs that begin with “a” in this list  you’ll find such terms as  “aboveboard” and “ad hoc,” which are, like “quickly,” valid ways to do things.  And there are 114 common nouns that do end in “ly”!

  • You may know that OneLook offers a reverse dictionary service that allows you to search for words by meaning. In addition to filtering “raw” wildcard searches, you can also filter reverse dictionary search results by part of speech. For example, a typical reverse dictionary search is “a*:love”, which asks for words that start with “a” and have a meaning related to “love”.   If you filter this result for verbs, “adore” will show up first in the results;  if you filter for adjectives, “amorous” will show up first;  and if you filter for nouns, “affection” will show up first.

  • Sometimes a word form can belong to multiple parts of speech; for example, “vacuum” is listed as both a noun and a verb. Can you think of a word that can be a noun, verb, adjective, or adverb? (Well, can you?)
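
For the programmers in the audience, the ":adjective" suffix from the first tip can be sketched in a few lines of Python. This is only an illustration of the idea, not OneLook's code: the function names, the tiny lexicon, and its part-of-speech tags are hypothetical, and the sketch does not attempt reverse dictionary queries like "a*:love".

```python
import re

# The filter names mentioned in the tips above.
POS_FILTERS = {"adjective", "noun", "verb", "adverb"}

def parse_query(query):
    """Split "abst*:adjective" into ("abst*", "adjective"); without a filter, return (query, None)."""
    pattern, sep, suffix = query.rpartition(":")
    if sep and suffix in POS_FILTERS:
        return pattern, suffix
    return query, None

def search(query, lexicon):
    """Return words matching the wildcard pattern and, if one is given, the part-of-speech filter."""
    pattern, pos = parse_query(query)
    # "*" matches any run of characters; everything else is matched literally.
    regex = re.compile(".*".join(map(re.escape, pattern.split("*"))), re.IGNORECASE)
    return [word for word, tags in lexicon.items()
            if regex.fullmatch(word) and (pos is None or pos in tags)]

# Hypothetical lexicon: each word maps to the set of parts of speech it can take.
LEXICON = {
    "abstract": {"adjective", "noun", "verb"},
    "abstruse": {"adjective"},
    "abstain": {"verb"},
    "abstinence": {"noun"},
    "vacuum": {"noun", "verb"},
}

print(search("abst*:adjective", LEXICON))
print(search("abst*:verb", LEXICON))
```

Storing a set of tags per word is what lets a form like "vacuum" show up under both the noun and the verb filter.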

Room for improvement

Purists will note that not every part of speech is available as a filtering option. Pronouns, prepositions, conjunctions, and interjections -- so-called “closed class” words -- aren’t so interesting because there are relatively few of them, and so they are not included in the filtering interface.

You’ll occasionally find errors in the new part-of-speech filtering feature.  In particular, proper names are not always recognized as such and are sometimes lumped in with common nouns.   Also, the reverse dictionary has some trouble with filtering polysemous words -- that is, words which have multiple senses, like "refuse."  You’ll notice on this page that we’ve asked for verbs related to "garbage" that begin with “ref*”, but, while the noun form of "refuse" is appropriate for this query, the verb form is not.   These glitches will be addressed over time.

Since we introduced wildcard matching a decade ago, OneLook users have requested part-of-speech filtering more than any other new feature.   We hope this change begins to address this need.  Please send us feedback if you have any comments about this feature or any other feature requests.


]]>