Introduction

This document demonstrates my approach to analyzing the dataset in the Uber Analytics Exercise.

Load Data and Take a Quick Glimpse

To begin the analysis, we load the .csv file into the R workspace and check its structure and summary.

## 'data.frame':    336 obs. of  7 variables:
##  $ Date           : Date, format: "2012-09-10" "2012-09-10" ...
##  $ Time..Local.   : int  7 8 9 10 11 12 13 14 15 16 ...
##  $ Eyeballs       : int  5 6 8 9 11 12 9 12 11 11 ...
##  $ Zeroes         : int  0 0 3 2 1 0 1 1 2 2 ...
##  $ Completed.Trips: int  2 2 0 0 4 2 0 0 1 3 ...
##  $ Requests       : int  2 2 0 1 4 2 0 0 2 4 ...
##  $ Unique.Drivers : int  9 14 14 14 11 11 9 9 7 6 ...
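The structure above could come from a load step along the lines of the sketch below. The file is a hypothetical stand-in: three rows copied from the output above are written to a temporary .csv, and both the header spelling ("Time (Local)" and so on) and the date of the third row are assumptions, not taken from the exercise file.

```r
# Hypothetical stand-in for the exercise file: a few rows taken from the
# str() output above. Header spelling and the third row's date are assumed.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,Time (Local),Eyeballs,Zeroes,Completed Trips,Requests,Unique Drivers",
  "2012-09-10,7,5,0,2,2,9",
  "2012-09-10,8,6,0,2,2,14",
  "2012-09-10,9,8,3,0,0,14"
), csv_path)

# read.csv() replaces spaces and parentheses in headers with dots,
# so "Time (Local)" becomes "Time..Local."
uber <- read.csv(csv_path, stringsAsFactors = FALSE)

# Dates are read as text; convert them so they can be grouped by day later
uber$Date <- as.Date(uber$Date, format = "%Y-%m-%d")

str(uber)      # column types and first values
summary(uber)  # per-column ranges and quartiles
```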

Data Exploration

The data can be viewed as a funnel from Eyeballs (where demand emerges) to Completed Trips, while Zeroes lead to potential drop-outs (users closing the app). Visualization helps us discover the patterns in these metrics more easily than a contingency table would.

Distribution of 5 numeric variables

The graph confirms our funnel assumption: the distributions of Eyeballs, Requests and Completed Trips each sit slightly higher than that of the stage that follows.
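As a rough illustration of this comparison, the sketch below draws box plots for the three funnel stages and compares their means. It uses only the ten values per column visible in the structure output above, and the choice of a box plot is an assumption, since the original graph type is not stated.

```r
# Ten rows per column, copied from the str() output above (sample only)
uberNum <- data.frame(
  Eyeballs        = c(5, 6, 8, 9, 11, 12, 9, 12, 11, 11),
  Zeroes          = c(0, 0, 3, 2, 1, 0, 1, 1, 2, 2),
  Completed.Trips = c(2, 2, 0, 0, 4, 2, 0, 0, 1, 3),
  Requests        = c(2, 2, 0, 1, 4, 2, 0, 0, 2, 4),
  Unique.Drivers  = c(9, 14, 14, 14, 11, 11, 9, 9, 7, 6)
)

# Funnel stages in order, from first contact to finished ride
funnel <- c("Eyeballs", "Requests", "Completed.Trips")

# Side-by-side box plots make the step down between stages visible
boxplot(uberNum[, funnel],
        main = "Funnel stages (sample rows)",
        ylab = "Count per hour")

# Mean of each stage; a funnel should shrink from left to right
stage_means <- colMeans(uberNum[, funnel])
stage_means
```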

Correlation of 5 numeric variables

By drawing a scatter plot matrix, we can see whether any of the numeric variables are correlated with one another.

Based on the plot, there appears to be a positive correlation between demand (Eyeballs) and supply (Unique Drivers), so we go on to calculate the correlation coefficient. A coefficient of 0.79 suggests a strong positive relationship between supply and demand.

cor(uberNum$Eyeballs, uberNum$Unique.Drivers)
## [1] 0.7895826
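A minimal sketch of the scatter plot matrix and the full correlation matrix follows. It uses only the ten sample rows visible in the structure output, so it will not reproduce the 0.79 reported for the full data; ten consecutive hours are far too few to reflect the overall relationship.

```r
# Ten rows per column, copied from the str() output above (sample only)
uberNum <- data.frame(
  Eyeballs        = c(5, 6, 8, 9, 11, 12, 9, 12, 11, 11),
  Zeroes          = c(0, 0, 3, 2, 1, 0, 1, 1, 2, 2),
  Completed.Trips = c(2, 2, 0, 0, 4, 2, 0, 0, 1, 3),
  Requests        = c(2, 2, 0, 1, 4, 2, 0, 0, 2, 4),
  Unique.Drivers  = c(9, 14, 14, 14, 11, 11, 9, 9, 7, 6)
)

# Scatter plot matrix: one panel for every pair of variables
pairs(uberNum, main = "Scatter plot matrix (sample rows)")

# Pairwise Pearson correlations for all five columns at once
cor_mat <- cor(uberNum)
round(cor_mat, 2)

# The single pair examined in the text: demand versus supply
demand_supply <- cor(uberNum$Eyeballs, uberNum$Unique.Drivers)
```

Computing the whole matrix first makes it easy to spot other strongly related pairs before singling one out.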

Exploration by Category

Having examined the numeric variables on their own, we now analyze them by categorical group, which in this case means Date and Time.

Most Completed Trips by Date and Time

According to the graph, completed trips are clearly concentrated on weekends (Saturday and Sunday) and in peak hours (5pm - 3am).
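One way to build such a breakdown is sketched below: derive a weekday from Date, then cross-tabulate completed trips by weekday and hour. Only the ten sample rows from the structure output are used, and it is an assumption that all ten fall on 2012-09-10, so the table covers a single day here; on the full data the same calls would span every day and hour.

```r
# Sample rows from the str() output; a shared date for all ten is assumed
uber <- data.frame(
  Date            = as.Date(rep("2012-09-10", 10)),
  Time..Local.    = 7:16,
  Completed.Trips = c(2, 2, 0, 0, 4, 2, 0, 0, 1, 3)
)

# as.POSIXlt()$wday does not depend on locale: 0 = Sunday, ..., 6 = Saturday
day_names <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
uber$Weekday <- factor(day_names[as.POSIXlt(uber$Date)$wday + 1],
                       levels = day_names)
uber$Weekend <- uber$Weekday %in% c("Sat", "Sun")

# Completed trips summed per weekday (rows) and hour (columns);
# weekdays absent from the sample show up as rows of zeros
trips <- xtabs(Completed.Trips ~ Weekday + Time..Local., data = uber)

trips_by_day  <- rowSums(trips)  # totals per weekday
trips_by_hour <- colSums(trips)  # totals per hour of day

# Hour with the most completed trips in the sample
busiest_hour <- as.integer(names(which.max(trips_by_hour)))
```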

The Gap between Demand and Supply

The core of the business is balancing demand (Eyeballs) against supply (Unique Drivers), so we explore how the gap (uber$Eyeballs - uber$Unique.Drivers) varies by date and time. The graph clearly shows many positive bars, where demand exceeds supply, especially on Friday night and Saturday.
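The gap calculation can be sketched as follows, again using only the ten sample rows from the structure output. The bar colors and the Gap column name are illustrative choices, not taken from the original analysis.

```r
# Sample rows from the str() output above (hours 7 to 16 of one day)
uber <- data.frame(
  Time..Local.   = 7:16,
  Eyeballs       = c(5, 6, 8, 9, 11, 12, 9, 12, 11, 11),
  Unique.Drivers = c(9, 14, 14, 14, 11, 11, 9, 9, 7, 6)
)

# Positive gap: more people opening the app than drivers online
uber$Gap <- uber$Eyeballs - uber$Unique.Drivers

# One bar per hour; hours with unmet demand are highlighted
barplot(uber$Gap,
        names.arg = uber$Time..Local.,
        col = ifelse(uber$Gap > 0, "firebrick", "grey70"),
        xlab = "Hour of day", ylab = "Eyeballs - Unique Drivers",
        main = "Demand-supply gap (sample rows)")

# Hours in the sample where demand exceeds supply
shortfall_hours <- uber$Time..Local.[uber$Gap > 0]
```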

Conclusion

For a more detailed walkthrough of the Uber analytics test, visit https://www.deskbright.com/uber/uber-analytics-test/.