Recipe Rescuer

Overview and Motivation

Have you ever looked hungrily inside your pantry or fridge with no idea what you should eat? This is a (not very) serious issue affecting millions of people globally. The Recipe Rescuer system suggests recipes for you to cook based on ingredients that you already have on hand. The top 3 recipes matching the user's entered ingredients are presented and ranked according to the number of additional ingredients that have to be purchased. Anyone who has ever needed a small push to start cooking a meal will recognize the need for, and will benefit from, a system like this.

The goals of this project were:

  1. To create a recipe suggestion service that works from ingredients already in the user's home, drawing on recipes collected from dozens of well-respected cooking websites.
  2. To reduce food waste by encouraging people to make the most out of ingredients (especially perishable ones) in their home.

Initial Questions

The primary question we hoped to answer was, "Can you generate unique recipe suggestions based on a list of what ingredients are in your kitchen?" We wanted the answer to be as useful as possible, so we focused on making sure that the suggested recipes required a minimal amount of shopping for extra ingredients.

We also wanted to visualize our network of recipes. While we initially planned to use Plotly's 3D network graphs to do this, we determined that NetworkX better suited our needs. After initially graphing the network of a small subset of recipes, we quickly realized that graphing the entire network would result in an unreadable image. We modified our approach to this question by choosing to graph the subset of the network that contained the recipes that were suggested to a user.

As the project progressed, we decided that we also wanted to discover what the most versatile ingredients are - the ones that occur the most frequently across all of the recipes. We cared about answering this question primarily because it would allow us to tell people which ingredients they should always have on hand. Additionally, this allowed us to suggest recipes that use these food staples. Finally, we were also able to graph the most versatile ingredient and all of its immediate connections to demonstrate how important that ingredient is to cooking.
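As a rough illustration of the "most versatile ingredient" idea, the sketch below counts how many recipes each ingredient appears in. It is only a minimal example: the function name `most_versatile`, the toy `RECIPES` dictionary, and the alphabetical tie-break are assumptions made for illustration, and it uses a plain counter rather than the project's network.

```python
from collections import Counter


def most_versatile(recipes, top_n=3):
    """Return the top_n (ingredient, recipe_count) pairs.

    `recipes` maps a recipe name to its list of cleaned ingredients.
    Ties are broken alphabetically (an assumption, for determinism).
    """
    counts = Counter()
    for ingredients in recipes.values():
        # Use a set so an ingredient listed twice in one recipe counts once.
        counts.update(set(ingredients))
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top_n]


# Hypothetical toy data, not taken from the Open Recipes dataset.
RECIPES = {
    "pancakes": ["flour", "eggs", "milk", "butter", "salt"],
    "omelette": ["eggs", "butter", "salt"],
    "shortbread": ["flour", "butter", "sugar"],
}

staples = most_versatile(RECIPES, top_n=3)
print(staples)
```

In a graph representation, the same number would correspond to how many recipes an ingredient is directly connected to.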

We were also curious about whether or not we could place ingredients into categories using Latent Dirichlet Allocation (LDA). This was an additional question that we attempted to answer near the end of the system's development. The extracted ingredient groupings could easily be identified as sweet or savory, but more detailed categorization wasn't possible.

Our Data

The recipe data for this project comes from a recipe database called Open Recipes on GitHub (https://github.com/fictivekin/openrecipes). The dataset contains approximately 173,000 recipes from dozens of well-known cooking websites. This data was an excellent find, but it required a great deal of cleanup to prepare it for our analysis.

Since there were no identifiable patterns to how the ingredients were listed, we had to search through every listed ingredient in every recipe to extract the food item (e.g., low fat no GMO organic milk -> milk). We did this by creating a list of 1600 food items and checking whether one of these food items was a substring of the listed ingredient. If none of the food items was a substring of the listed ingredient, the listed ingredient was still kept so that the data would not be completely lost.
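The substring check described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual cleaning code: the function name `extract_food_item`, the tiny `FOOD_ITEMS` list, and the choice to prefer the longest match when several food items match are all assumptions.

```python
def extract_food_item(listed_ingredient, food_items):
    """Map a raw ingredient string to a known food item.

    Returns the longest food item that is a substring of the raw text
    (longest-match is an assumption), or the lowercased raw text
    unchanged if nothing matches, so that no data is lost.
    """
    text = listed_ingredient.lower().strip()
    matches = [item for item in food_items if item in text]
    if not matches:
        return text
    return max(matches, key=len)


# Hypothetical stand-in for the list of 1600 food items.
FOOD_ITEMS = ["milk", "butter", "peanut butter", "egg"]

raw_ingredients = [
    "low fat no GMO organic milk",
    "2 tbsp creamy peanut butter",
    "1 pinch saffron",
]
cleaned = [extract_food_item(raw, FOOD_ITEMS) for raw in raw_ingredients]
print(cleaned)
```

Checking every food item against every listed ingredient is simple but slow on a large dataset, which is consistent with the long running time of the cleaning step.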

This process took a great deal of time. The algorithm took approximately 25 hours to run, even though we split the data into subsets and ran them in parallel across three people. After running the data through this algorithm, we added finishing touches using regex patterns.

After the cleaning process, we were left with only approximately 90,000 recipe entries, since many of the data entries had incorrect data (e.g., no food items were listed in the ingredient column).

Final Analysis & Discussion

The primary goal of Recipe Rescuer was to create a recipe suggestion service that works from ingredients already in the user's home, drawing on recipes collected from dozens of well-respected cooking websites. We achieved this goal by building a network of ingredients from 90,000 recipes. A user can enter a handful of ingredients they would like to use, and Recipe Rescuer uses this network to find and visualize the 3 recipes that require the least additional shopping. The network is also used to determine what the most commonly used ingredients are across all of the recipes. Finally, LDA was used to classify ingredients according to whether they are sweet or savory.
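To make the ranking idea concrete, here is a minimal sketch of picking the 3 recipes that need the fewest extra ingredients. It is an illustration under stated assumptions, not the project's implementation: it uses a plain dictionary instead of the ingredient network, and the name `suggest_recipes`, the toy data, and the rule of skipping recipes that share no ingredient with the pantry are all hypothetical.

```python
def suggest_recipes(pantry, recipes, top_n=3):
    """Rank recipes by how many ingredients the user still has to buy.

    `pantry` is a list of ingredients on hand; `recipes` maps a recipe
    name to its ingredient list. Returns up to top_n pairs of
    (recipe_name, sorted_missing_ingredients), fewest missing first.
    """
    have = {item.lower() for item in pantry}
    scored = []
    for name, ingredients in recipes.items():
        needed = set(ingredients)
        if not needed & have:
            # Assumption: ignore recipes that use nothing from the pantry.
            continue
        missing = sorted(needed - have)
        scored.append((len(missing), name, missing))
    # Sort by number of missing ingredients, then by name for stable output.
    scored.sort()
    return [(name, missing) for _, name, missing in scored[:top_n]]


# Hypothetical toy data, not taken from the Open Recipes dataset.
RECIPE_BOOK = {
    "pancakes": ["flour", "eggs", "milk", "butter", "salt"],
    "omelette": ["eggs", "butter", "salt"],
    "shortbread": ["flour", "butter", "sugar"],
    "fruit salad": ["apple", "banana"],
}

suggestions = suggest_recipes(["eggs", "butter", "flour"], RECIPE_BOOK)
for name, missing in suggestions:
    print(name, "needs:", ", ".join(missing))
```

The missing-ingredient list doubles as a shopping list for each suggestion.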

We learned that there are many diverse ingredients and recipes available on the internet. We also learned that staple ingredients such as salt, butter, and eggs can be combined in many different ways with just a few other ingredients to create many different meals. Through our network analysis, we learned how different ingredients are connected. Using the results of Recipe Rescuer, we realized how easy it can be to cook meals using just a few simple ingredients that you already have lying around.

Contributors

Amy Kruzick, Jacqueline Vital, Sagar Ali