First, we analyze the weather attributes for the day and location of the Yelp reviews.
The average temperature is 21.3°C (70.3°F). The maximum temperature is high, at 40.3°C (104.5°F),
but Section 6 determines that it is not an outlier. The average precipitation is 2.78 mm.
The maximum is also high, at 206.2 mm (8.1 inches); Section 6 likewise determines that it is
not an outlier. For most days, the precipitation is zero. For SNOW, the average is 0.1, but
again it is missing or zero for most days.
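
The unit conversions quoted above, and the way a mostly-zero attribute can still have a positive mean, can be checked with a short sketch. The helper names and the `daily_precip_mm` values are hypothetical and chosen only for illustration; they are not taken from the dataset.

```python
from statistics import mean, median


def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def mm_to_inches(mm: float) -> float:
    """Convert a length from millimetres to inches (1 inch = 25.4 mm)."""
    return mm / 25.4


# Figures quoted in the text, rounded to one decimal place.
avg_temp_f = round(celsius_to_fahrenheit(21.3), 1)   # 70.3
max_temp_f = round(celsius_to_fahrenheit(40.3), 1)   # 104.5
max_precip_in = round(mm_to_inches(206.2), 1)        # 8.1

# Hypothetical daily precipitation values in mm (NOT real data):
# most days are dry, so the median is zero while the mean is positive.
daily_precip_mm = [0.0, 0.0, 0.0, 0.0, 12.5, 0.0, 0.0, 3.2, 0.0, 0.0]
precip_mean = mean(daily_precip_mm)
precip_median = median(daily_precip_mm)
share_zero = daily_precip_mm.count(0.0) / len(daily_precip_mm)

print(avg_temp_f, max_temp_f, max_precip_in)
print(precip_mean, precip_median, share_zero)
```

A few wet days are enough to pull the mean above zero while the median stays at zero, which matches the pattern described for precipitation.
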
Next, we analyze the distance and location attributes in Table 5.2. Dist is the distance in miles
from the restaurant location to its closest weather station. A maximum distance of 69 miles was allowed to ensure
that the station and restaurant are close enough to have similar weather. On average, the closest station is
6.6 miles away, and the furthest is 64.7 miles away. The state attribute shows that we have Yelp review data for
restaurants in 47 different states, with the most from California. The states not included in this analysis are
Montana, South Dakota, and North Dakota. To obtain the Yelp data, we passed the 300 most populous counties to the
Yelp API, and none of those counties is in these three states.
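
The text does not state how Dist was computed. One common choice for distances between latitude/longitude pairs is the haversine great-circle formula, and the sketch below assumes it. The function names `haversine_miles` and `closest_station`, the station identifiers, and all coordinates are hypothetical; only the 69-mile cutoff comes from the text.

```python
import math

EARTH_RADIUS_MILES = 3958.8   # mean Earth radius
MAX_DIST_MILES = 69.0         # cutoff stated in the text


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def closest_station(restaurant, stations, max_dist=MAX_DIST_MILES):
    """Return (station_id, distance) for the nearest station,
    or None if no station lies within max_dist miles."""
    best = min(
        ((sid, haversine_miles(restaurant[0], restaurant[1], lat, lon))
         for sid, (lat, lon) in stations.items()),
        key=lambda pair: pair[1],
        default=None,
    )
    if best is None or best[1] > max_dist:
        return None
    return best


# Hypothetical station and restaurant coordinates (NOT from the dataset).
stations = {"A": (34.05, -118.25), "B": (36.17, -115.14)}
nearby = closest_station((34.10, -118.30), stations)
remote = closest_station((47.00, -110.00), stations)
print(nearby, remote)
```

A restaurant with no station inside the cutoff returns `None`, which corresponds to dropping it rather than pairing it with weather from a distant station.
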
Next, we consider summary statistics for the restaurant and review attributes in Table 5.3. The median Yelp review
rating, review_rating, is 5 stars. At the restaurant level, the median rating, rest_rating, is 4.5 stars.
The review ratings are int64, since the only possible values are the integers from 1 to 5 stars, while the
restaurant ratings are float64, since they are averaged over many reviews. The average restaurant also has many
reviews: 236, as shown for review_count. The price attribute is a categorical variable, with the most
common restaurant being '$$' in price.
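
The distinction between integer review ratings and fractional restaurant ratings, and the use of a mode rather than a mean for the categorical price attribute, can be illustrated with a small sketch. It uses only the Python standard library rather than whatever tooling produced Table 5.3, and every value in it is hypothetical.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical star ratings for the reviews of one restaurant (NOT real data).
# Each review is a whole number of stars from 1 to 5.
review_ratings = [5, 4, 5, 3, 5, 4, 5]

review_median = median(review_ratings)   # stays an integer star value
rest_rating = mean(review_ratings)       # averaging produces a fraction

# Hypothetical price labels for several restaurants (NOT real data).
# A categorical attribute has no meaningful mean, so report its mode.
prices = ["$$", "$", "$$", "$$$", "$$", "$"]
most_common_price, price_count = Counter(prices).most_common(1)[0]

print(review_median, rest_rating)
print(most_common_price, price_count)
```

Averaging whole-star reviews generally yields a non-integer value, which is why a restaurant-level rating needs a floating-point type even though each individual review does not.
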
Lastly, we analyze the attributes from the sentiment analysis. The most common sentiment is positive, coded as 1.
The average polarity is 0.3, with a standard deviation of 0.3.
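
As a rough illustration of how a sentiment label and the polarity statistics relate, the sketch below derives a label from the sign of each polarity score and then summarizes both. The text only confirms that positive sentiment is coded as 1; the codes 0 and -1, the zero threshold, the function name `sentiment_label`, and the polarity values are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean, stdev


def sentiment_label(polarity: float) -> int:
    """Map a polarity score to a label: 1 positive, 0 neutral, -1 negative.
    The 0 and -1 codes and the zero threshold are assumptions."""
    if polarity > 0:
        return 1
    if polarity < 0:
        return -1
    return 0


# Hypothetical polarity scores in [-1, 1] (NOT real data).
polarities = [0.5, 0.2, 0.7, -0.3, 0.0, 0.4]

labels = [sentiment_label(p) for p in polarities]
most_common_label = Counter(labels).most_common(1)[0][0]
polarity_mean = mean(polarities)
polarity_sd = stdev(polarities)   # sample standard deviation

print(labels, most_common_label)
print(polarity_mean, polarity_sd)
```
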