DASH App Development using Uber data

Vitor dos Santos
Nov 24, 2020

According to FinancesOnline, DASH App is a data visualization software that transforms data from custom and standard objects into an interactive system of clickable charts that show real-time information. Perfect for creating custom dashboards, DASH App enables users to easily navigate, view, comprehend and compare numerical data and come to intelligent, data-driven decisions.

DASH App is therefore a good fit for data science applications, since it makes it easier to analyze data from different perspectives. In this article, we show how to build a Dash app using Google Colab. As a case study, we use a dataset from Uber that records the location, date, and time at which an Uber driver was requested in NYC. The end goal is an interactive web page where the user can explore the data visually, similar to the demos provided by Plotly | Dash.

Dataset

We will use a dataset that has information related to Uber, provided by FiveThirtyEight. This dataset has 4 columns:

  • Date/Time: The date and time of the Uber pickup.
  • Lat: The latitude of the Uber pickup
  • Lon: The longitude of the Uber pickup
  • Base: The TLC base company code affiliated with the Uber pickup.

In this application, we used the data from May 2014. However, you may use any other dataset you like.
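
Before pre-processing, it can help to confirm that the file really has the four columns listed above. The sketch below runs that check on a two-row inline sample rather than on the real file; the sample rows and the helper name `load_pickups` are illustrative assumptions, not part of the original notebook.

```python
import io
import pandas as pd

EXPECTED_COLUMNS = ["Date/Time", "Lat", "Lon", "Base"]

# Two made-up rows laid out like the columns described above (illustrative only).
SAMPLE_CSV = '''"Date/Time","Lat","Lon","Base"
"5/1/2014 0:02:00",40.7521,-73.9914,"B02512"
"5/25/2014 16:30:00",40.6965,-73.9715,"B02512"
'''

def load_pickups(source):
    """Read a pickups CSV and fail early if an expected column is missing."""
    frame = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in frame.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return frame

sample = load_pickups(io.StringIO(SAMPLE_CSV))
print(sample.head())
```

To run the same check on the real data, pass the file path instead of the inline sample, for example `load_pickups("uber-raw-data-may14.csv")`.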

Pre-processing

First, we must pre-process the dataset. We will not use the Base feature, so we will drop this column. In addition, the Date/Time column is stored as a string that holds both the date and the time. The figure below shows the first 5 samples of the dataset.

Figure 1: First 5 samples of the original dataset.

For our application, we will process the Date/Time column and create two new features: the day of the month and the hour of the Uber pickup. We will also create another feature called Size, which will be used later when plotting the map. The code used to perform these modifications is shown below and the first 5 samples are shown in Figure 2.

import pandas as pd

data = pd.read_csv("uber-raw-data-may14.csv")
data['Date/Time'] = pd.to_datetime(data['Date/Time'])
data['Day'] = data['Date/Time'].dt.day
data['Hour'] = data['Date/Time'].dt.hour
data['Size'] = 1
data = data.drop(['Date/Time', 'Base'], axis=1)

Figure 2: First 5 samples after pre-processing.

Now, the next step is to filter all the occurrences from a specific day, which is necessary since we will show information based on this day. The code below shows this step considering that we want to filter information related to the 25th day of the month.

day = 25
data_day = data.loc[data['Day'] == day]

Figure 3: First 5 samples of the new DataFrame.
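
The pre-processing and filtering steps above can also be wrapped in one reusable function, which makes it easy to switch to another day. This is a sketch under assumptions: the function name `pickups_for_day`, the inline sample rows, and the explicit `format` string (which assumes timestamps look like `5/25/2014 16:30:00`) are not taken from the original notebook.

```python
import pandas as pd

def pickups_for_day(raw, day):
    """Return the pickups of one day of the month with Day, Hour and Size columns."""
    data = raw.copy()  # leave the caller's DataFrame untouched
    # An explicit format avoids month-first/day-first ambiguity; adjust it if your file differs.
    data["Date/Time"] = pd.to_datetime(data["Date/Time"], format="%m/%d/%Y %H:%M:%S")
    data["Day"] = data["Date/Time"].dt.day
    data["Hour"] = data["Date/Time"].dt.hour
    data["Size"] = 1  # constant marker size used later by the map
    data = data.drop(columns=["Date/Time", "Base"])
    return data.loc[data["Day"] == day].reset_index(drop=True)

# Made-up rows for illustration only.
raw = pd.DataFrame({
    "Date/Time": ["5/1/2014 0:02:00", "5/25/2014 16:30:00", "5/25/2014 4:05:00"],
    "Lat": [40.7521, 40.6965, 40.7301],
    "Lon": [-73.9914, -73.9715, -73.9950],
    "Base": ["B02512", "B02512", "B02512"],
})
data_day = pickups_for_day(raw, day=25)
print(data_day)
```

Because the function works on a copy, the same raw DataFrame can be filtered for several days without reloading the file.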

Histogram

Now that the pre-processing is finished, we will build our interactive charts using Plotly, a web-based graphing and analytics platform.

First, we will create a histogram that displays the number of Uber requests per hour of the specified day (in our example, May 25th). The code to generate the histogram is shown below.

import plotly.express as px

fig = px.histogram(data_day, x="Hour", color="Hour",
                   color_discrete_sequence=px.colors.qualitative.Light24)
fig.show()

The first argument of px.histogram is the dataset. The x argument selects the feature plotted on the x-axis; the color argument assigns a distinct color to each value of the chosen feature (here, each hour); and color_discrete_sequence sets the order of the colors used. In this case, we used Light24, one of the qualitative color sequences that ship with Plotly. Running the above code produces the histogram shown in Figure 4.

Figure 4: Histogram of number of Uber pickups by hour.

That’s a colorful histogram, isn’t it? It also gives us a lot of information! For example, we can see that Uber requests peak in the afternoon, around 3 to 5 pm, which makes sense since many people are leaving work. Few Uber requests were made around 4 am, when most people are still at home resting.
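
The same reading can be checked numerically instead of by eye. The sketch below counts pickups per hour and reports the busiest hour; the `Hour` values are made-up stand-ins for `data_day`, so the printed numbers say nothing about the real dataset.

```python
import pandas as pd

# Stand-in for data_day: made-up hours, for illustration only.
data_day = pd.DataFrame({"Hour": [4, 15, 15, 16, 16, 16, 17, 17, 23]})

# Count pickups for every hour of the day, keeping hours with zero pickups.
pickups_per_hour = (
    data_day["Hour"].value_counts().reindex(range(24), fill_value=0).sort_index()
)

busiest_hour = pickups_per_hour.idxmax()
print(f"Busiest hour: {busiest_hour}:00 with {pickups_per_hour.max()} pickups")
```

Reindexing over `range(24)` keeps empty hours in the table, which matters when comparing days that do not cover the same set of hours.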

Interactive Map

Now, let’s build an interactive map that uses the latitude and longitude information to plot the exact position where each Uber ride was requested. We will also use color to show the hour at which the request was made. The code to generate this map is shown below:

fig = px.scatter_mapbox(data_day, lat='Lat', lon='Lon', size='Size',
                        color='Hour',
                        color_continuous_scale=px.colors.sequential.Sunset,
                        size_max=5, zoom=12, mapbox_style='carto-positron')
fig.update_layout(margin={'r': 0, 't': 0, 'l': 0, 'b': 0})
fig.show()

The first argument is the dataset. The lat and lon arguments name the features that hold the latitude and longitude of each sample, here the Lat and Lon columns, respectively. The size argument names the column that controls the size of each circle drawn on the map. Since we want every circle to have the same size, we set all values of the ‘Size’ column to 1 (as shown in the pre-processing code). The remaining arguments control the color scale, the maximum marker size, the initial zoom, and the base map style. For further information about this function, see the documentation.

Running the code above produces the map shown in Figure 5, which displays every position where an Uber driver was requested in NYC on May 25th, 2014.

Figure 5: Interactive Map.

On this map, the color of each point indicates the hour at which the Uber was requested, as shown by the color bar in Figure 5. Yellow points represent the early hours around dawn, red points represent the afternoon, and purple points represent the night.

If the user hovers the mouse over a specific dot, the latitude, longitude, and hour of that pickup are shown, as illustrated in Figure 6.

Figure 6: Information related to a single point.

The GIF below shows an example of how the user can interact with the map, which is pretty interesting!

Figure 7: Using the interactive map.

The code used to build this application can be found on my GitHub. Feel free to download it, test it, and make any modifications you want! The code is based on a notebook by Andressa Stéfany. You can find her work at this link.

At the end of the notebook, there is also a section that creates a web page with both the histogram and the interactive map using Ngrok. Feel free to run that part of the code and interact with the page!

Conclusion

In this article, we explained how to develop interactive visualizations with Dash and Plotly. We created a colorful histogram, which shows the number of Uber requests for each hour of the day, and a map with interactive features that lets the user select the region they want to analyze more closely!

There are still some features that would be interesting to develop, such as letting the user select the day to analyze or filter the points on the map by a selected hour. These features are left for future work.

I hope that you have enjoyed reading this article as much as I enjoyed making it! See ya!

Vitor dos Santos

PhD student in Computer Science at Dublin City University. Interested in Computer Vision, Deep Learning and Data Science.